Building Production-Ready MCP Servers That Actually Scale: Our HTTP + Webhook Approach
How we reimagined the Model Context Protocol for enterprise deployments
The Problem with Standard MCP
The Model Context Protocol (MCP) is transforming how AI agents interact with external systems. It provides a standardized way for LLMs to access tools, read resources, and receive real-time updates. But there's a catch.
The standard MCP transport relies on Server-Sent Events (SSE) or stdio for communication. While these work great for desktop applications like Claude Desktop or single-user development tools, they create significant challenges at scale:
- Persistent connections don't scale - Each client maintains a long-lived SSE connection, consuming server resources
- No horizontal scaling - Connection state is tied to a specific server instance
- Load balancer headaches - Sticky sessions become mandatory, defeating the purpose of load balancing
- Third-party integration gap - Services like GitHub and Slack deliver events via webhooks (and MongoDB via change streams), not SSE
We needed something different. We needed MCP servers that could:
- Scale horizontally with zero connection state
- Integrate natively with third-party webhook providers
- Deploy to Kubernetes with trivial multi-replica configurations
- Maintain full MCP protocol compatibility
So we built mcp-http-webhook - and published it to npm.
Our Approach: HTTP + Webhooks
The core insight is simple: replace persistent connections with stateless HTTP requests and webhook callbacks.
Standard MCP:
Client ←──SSE Connection──→ Server (persistent, stateful)
Our Approach:
Client ──HTTP POST──→ Server ──Response──→ Client (stateless)
Third-Party ──Webhook──→ Server ──Webhook──→ Client (event-driven)
How It Works
For Tools and Resources: Clients make standard HTTP POST requests to call tools or read resources. No connection to maintain.
# Call a tool
curl -X POST https://mcp.example.com/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "find_documents",
      "arguments": { "collection": "users", "filter": {} }
    }
  }'
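The reply is a plain JSON-RPC response. Per the MCP spec, tool output comes back as a content array; the document payload below is illustrative:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      { "type": "text", "text": "{ \"documents\": [ ... ] }" }
    ]
  }
}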
For Subscriptions (The Magic Part): Instead of the server pushing updates through SSE, clients provide a webhook URL when subscribing. Updates flow through webhooks.
1. Client subscribes, providing their webhook URL
2. Server registers with third-party service (e.g., GitHub, MongoDB)
3. Third-party sends events to our server's webhook endpoint
4. Our server processes and forwards to client's webhook URL
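Concretely, subscribing is just another stateless POST. resources/subscribe is a standard MCP method; carrying the client's webhook URL alongside it is this library's extension, and the exact parameter name below is illustrative:

curl -X POST https://mcp.example.com/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 2,
    "method": "resources/subscribe",
    "params": {
      "uri": "github://repo/acme/widgets/issues",
      "webhookUrl": "https://client.example.com/hooks/mcp-updates"
    }
  }'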
This architecture means:
- Zero connection state - Any server instance can handle any request
- Native third-party integration - GitHub webhooks, MongoDB change streams, Slack events all fit naturally
- Trivial scaling - Add more pods, point the load balancer at them, done
The Library: mcp-http-webhook
We packaged this approach into a production-ready TypeScript library. Here's what makes it special:
Built on the Official MCP SDK
We don't reinvent the protocol. We use @modelcontextprotocol/sdk under the hood, ensuring full compatibility with MCP clients and tools like MCP Inspector.
import { createMCPServer } from 'mcp-http-webhook';
import { RedisStore } from 'mcp-http-webhook/stores';

const server = createMCPServer({
  name: 'my-mcp-server',
  version: '1.0.0',
  publicUrl: 'https://mcp.example.com',
  store: new RedisStore(process.env.REDIS_URL),
  tools: [...],
  resources: [...],
});

await server.start();
External State Storage
Subscription data lives in Redis (or any key-value store you implement). This is what enables horizontal scaling - any server instance can look up any subscription.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttl?: number): Promise<void>;
  delete(key: string): Promise<void>;
  scan?(pattern: string): Promise<string[]>;
}
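For a sense of what an implementation involves, here is a minimal in-memory store, assuming TTLs are given in seconds as with Redis. It is fine for local development, but it defeats horizontal scaling, so use a shared store in production:

class InMemoryStore implements KeyValueStore {
  private data = new Map<string, { value: string; expiresAt?: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.data.get(key);
    if (!entry) return null;
    if (entry.expiresAt && Date.now() > entry.expiresAt) {
      this.data.delete(key); // Lazily evict expired entries
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttl?: number): Promise<void> {
    this.data.set(key, { value, expiresAt: ttl ? Date.now() + ttl * 1000 : undefined });
  }

  async delete(key: string): Promise<void> {
    this.data.delete(key);
  }

  async scan(pattern: string): Promise<string[]> {
    // Loosely mirrors Redis SCAN for simple "prefix*" patterns
    const prefix = pattern.replace(/\*$/, '');
    return [...this.data.keys()].filter((k) => k.startsWith(prefix));
  }
}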
First-Class Webhook Support
Resources can define subscription handlers that integrate with any third-party webhook provider:
resources: [{
  uri: 'github://repo/{owner}/{repo}/issues',
  name: 'GitHub Issues',
  subscription: {
    // Called when a client subscribes
    onSubscribe: async (uri, subscriptionId, thirdPartyWebhookUrl, context) => {
      const { owner, repo } = parseUri(uri);
      // Register a webhook with GitHub pointing at thirdPartyWebhookUrl
      const webhook = await octokit.repos.createWebhook({
        owner, repo,
        config: { url: thirdPartyWebhookUrl, content_type: 'json' },
        events: ['issues'],
      });
      return { thirdPartyWebhookId: webhook.data.id };
    },
    // Called when GitHub sends us a webhook
    onWebhook: async (subscriptionId, payload, headers) => {
      const owner = payload.repository.owner.login;
      const repo = payload.repository.name;
      return {
        resourceUri: `github://repo/${owner}/${repo}/issues`,
        changeType: payload.action === 'opened' ? 'created' : 'updated',
        data: payload.issue,
      };
    },
    // Called when the client unsubscribes
    onUnsubscribe: async (uri, subscriptionId, storedData, context) => {
      const { owner, repo } = parseUri(uri);
      await octokit.repos.deleteWebhook({ owner, repo, hook_id: storedData.thirdPartyWebhookId });
    },
  },
}]
Production Hardened
- Automatic retry with exponential backoff for webhook delivery
- Signature verification for incoming and outgoing webhooks (HMAC SHA-256; sketched below)
- Health check endpoints (/health, /ready)
- Prometheus metrics integration
- Graceful shutdown handling
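The signature check is plain HMAC SHA-256 over the raw request body. As a sketch (the header format "sha256=<hex>" is an assumption; check the library docs for the exact convention):

import { createHmac, timingSafeEqual } from 'node:crypto';

function verifyWebhookSignature(rawBody: string, signatureHeader: string, secret: string): boolean {
  const expected = 'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex');
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so guard first
  return a.length === b.length && timingSafeEqual(a, b);
}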
Real-World Example: MongoDB MCP Server
Let's look at how we built a production MongoDB MCP server using this library. This server:
- Exposes MongoDB operations as MCP tools
- Lists collections as MCP resources
- Provides real-time updates via MongoDB change streams
- Falls back to polling when change streams aren't available
Architecture Overview
┌─────────────────────────────────────────────────────────┐
│ MongoDB MCP Server │
├─────────────────────────────────────────────────────────┤
│ Tools: find, insert, update, delete, aggregate │
│ Resources: Collections with schema inference │
│ Subscriptions: Change streams + polling fallback │
├─────────────────────────────────────────────────────────┤
│ mcp-http-webhook │ MongoManager │ ChangeTracker │
├─────────────────────────────────────────────────────────┤
│ Redis Store (subscriptions, credentials) │
└─────────────────────────────────────────────────────────┘
Defining Tools
MongoDB operations become MCP tools with JSON Schema validation:
tools: [
  {
    name: 'find_documents',
    description: 'Find documents in a collection using filter, projection, and sorting.',
    inputSchema: {
      type: 'object',
      properties: {
        database: { type: 'string' },
        collection: { type: 'string' },
        filter: { type: 'object', default: {} },
        projection: { type: 'object' },
        sort: { type: 'object' },
        limit: { type: 'number', default: 100 },
      },
      required: ['collection'],
    },
    handler: async (args, context) => {
      const credentials = await getCredentials(context.userId);
      const documents = await mongoManager.findDocuments(
        credentials,
        args.database,
        args.collection,
        args.filter,
        { projection: args.projection, sort: args.sort, limit: args.limit }
      );
      return { documents: sanitizeDocuments(documents) };
    },
  },
  // insert_documents, update_documents, delete_documents, aggregate_documents...
]
Defining Resources with Pagination
Collections are exposed as resources with built-in pagination:
resources: [
  {
    uri: 'mongodb://{connection}/collection/{database}/{collection}',
    name: 'MongoDB Collection',
    mimeType: 'application/json',

    // List all collections across all connections
    list: async (context, options) => {
      const connections = await listConnections(context.userId);
      const resources = [];
      for (const conn of connections) {
        const collections = await mongoManager.listCollections(conn.credentials);
        for (const coll of collections) {
          resources.push({
            uri: buildResourceUri(conn.name, conn.database, coll.name),
            name: `${conn.name}.${coll.name}`,
            metadata: await buildCollectionMetadata(coll),
          });
        }
      }
      // Apply pagination
      const page = options?.page ?? 1;
      const limit = options?.limit ?? 100;
      const offset = (page - 1) * limit;
      return {
        resources: resources.slice(offset, offset + limit),
        nextCursor: offset + limit < resources.length ? String(page + 1) : undefined,
        pagination: { page, limit, total: resources.length, hasMore: offset + limit < resources.length },
      };
    },

    // Read collection data
    read: async (uri, context, options) => {
      const { connection, database, collection } = parseUri(uri);
      const credentials = await getCredentials(context.userId);
      const page = options?.pagination?.page ?? 1;
      const limit = options?.pagination?.limit ?? 100;
      // countDocuments: a helper assumed alongside findDocuments
      const total = await mongoManager.countDocuments(credentials, database, collection, {});
      const documents = await mongoManager.findDocuments(
        credentials, database, collection, {},
        { limit, skip: (page - 1) * limit }
      );
      const hasMore = page * limit < total;
      return {
        contents: { data: documents.map(convertToRow), mimeType: 'application/json' },
        nextCursorUrl: hasMore ? `${uri}/page/${page + 1}` : undefined,
        pagination: { page, limit, total, hasMore },
      };
    },
  },
]
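Reading a specific page is again a stateless POST. resources/read is the standard MCP method; the /page/2 suffix follows the nextCursorUrl convention in the read handler above, and the connection, database, and collection names are illustrative:

curl -X POST https://mcp.example.com/mcp \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 3,
    "method": "resources/read",
    "params": { "uri": "mongodb://prod/collection/app/users/page/2" }
  }'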
Change Stream Subscriptions
The real power comes from subscriptions. We use MongoDB change streams when available, with polling fallback:
subscription: {
  onSubscribe: async (uri, subscriptionId, webhookUrl, context) => {
    const { database, collection } = parseUri(uri);
    const credentials = await getCredentials(context.userId);
    // Start tracking changes
    await changeTracker.startTracking(
      subscriptionId,
      credentials,
      database,
      collection,
      webhookUrl,
      uri
    );
    return { thirdPartyWebhookId: subscriptionId };
  },
  onUnsubscribe: async (uri, subscriptionId) => {
    await changeTracker.stopTracking(subscriptionId);
  },
  onWebhook: async (subscriptionId, payload, headers) => {
    // Process incoming change events
    return {
      resourceUri: payload.resourceUri,
      changeType: payload.changeType,
      data: { rows: payload.rows, changes: payload.changes }
    };
  }
}
The ChangeTracker class handles the complexity:
class ChangeTracker {
  async startTracking(subscriptionId, credentials, database, collection, webhookUrl, uri) {
    try {
      // Try change streams first
      const changeStream = await mongoManager.watchCollection(credentials, database, collection);
      changeStream.on('change', async (change) => {
        // Convert the MongoDB change event into a webhook payload
        await sendWebhook(webhookUrl, {
          resourceUri: buildDocumentUri(change.documentKey._id),
          changeType: change.operationType,
          rows: [change.fullDocument]
        });
      });
      changeStream.on('error', () => this.startFallback(/*...*/));
    } catch {
      // Fall back to polling for non-replica-set deployments
      this.startFallback(subscriptionId, credentials, database, collection, webhookUrl);
    }
  }

  private async startFallback(subscriptionId, credentials, database, collection, webhookUrl) {
    let lastTimestamp = new Date();
    setInterval(async () => {
      const newDocs = await mongoManager.findDocuments(
        credentials, database, collection,
        { updatedAt: { $gt: lastTimestamp } }
      );
      lastTimestamp = new Date();
      for (const doc of newDocs) {
        await sendWebhook(webhookUrl, {
          resourceUri: buildDocumentUri(doc._id),
          changeType: 'updated',
          rows: [doc]
        });
      }
    }, POLLING_INTERVAL);
  }
}
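The sendWebhook helper used above can stay small. Here is a sketch of delivery with exponential backoff, roughly the behavior the library's built-in retry provides; the attempt count and delay schedule are illustrative:

async function sendWebhook(url: string, payload: unknown, maxAttempts = 5): Promise<void> {
  let lastError: unknown = new Error('webhook delivery failed');
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const res = await fetch(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload),
      });
      if (res.ok) return; // Delivered
      lastError = new Error(`Webhook returned status ${res.status}`);
    } catch (err) {
      lastError = err; // Network error, retry
    }
    // Exponential backoff: 1s, 2s, 4s, 8s... capped at 30s
    await new Promise((r) => setTimeout(r, Math.min(1000 * 2 ** (attempt - 1), 30_000)));
  }
  throw lastError;
}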
Credential Discovery
A unique feature: we expose a .well-known/credentials endpoint that describes what credentials the MCP server needs:
credentials: [
  {
    name: 'mongo_uri',
    description: 'MongoDB connection URI',
    type: 'string',
    required: false,
    config: {
      type: 'keyValue',
      inject: [{ location: 'header', key: 'x-mongo-uri', value: '{{mongo_uri}}' }]
    }
  },
  {
    name: 'mongo_database',
    description: 'Default database name',
    type: 'string',
    required: true,
    config: {
      type: 'keyValue',
      inject: [{ location: 'header', key: 'x-mongo-database', value: '{{mongo_database}}' }]
    }
  },
  // More credential fields...
]
This allows orchestration systems to automatically configure connections without hardcoded integrations.
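In practice, an orchestrator first asks the server what it needs, then injects the values as headers on every MCP call, exactly as the inject config above specifies. The connection string and database name below are placeholders:

# Discover required credentials
curl -s https://mcp.example.com/.well-known/credentials

# Inject them as headers on subsequent calls
curl -X POST https://mcp.example.com/mcp \
  -H "Content-Type: application/json" \
  -H "x-mongo-uri: mongodb+srv://user:pass@cluster.example.net" \
  -H "x-mongo-database: app" \
  -d '{"jsonrpc": "2.0", "id": 4, "method": "tools/list"}'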
Building Your Own MCP with This Library
Ready to build your own production MCP server? Here's the step-by-step:
1. Install the Library
npm install mcp-http-webhook zod
npm install @modelcontextprotocol/sdk express # peer dependencies
npm install ioredis # for Redis store
2. Set Up Basic Server
import { createMCPServer } from 'mcp-http-webhook';
import { RedisStore } from 'mcp-http-webhook/stores';
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);
const store = new RedisStore(redis);

const server = createMCPServer({
  name: 'my-custom-mcp',
  version: '1.0.0',
  publicUrl: process.env.PUBLIC_URL, // Your public HTTPS URL
  port: 3000,
  store,
  // Optional: Add authentication
  authenticate: async (req) => {
    const token = req.headers.authorization?.replace('Bearer ', '');
    const user = await validateToken(token);
    return { userId: user.id };
  },
  tools: [],
  resources: [],
});

await server.start();
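With the server running, a quick smoke test hits the built-in health endpoints:

curl -s http://localhost:3000/health
curl -s http://localhost:3000/ready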
3. Add Tools
tools: [
  {
    name: 'my_tool',
    description: 'Does something useful',
    inputSchema: {
      type: 'object',
      properties: {
        input: { type: 'string', description: 'The input value' }
      },
      required: ['input']
    },
    handler: async (args, context) => {
      // context.userId available from authentication
      const result = await doSomething(args.input);
      return { result };
    }
  }
]
4. Add Resources with Subscriptions
resources: [
  {
    uri: 'myservice://{tenant}/data/{id}',
    name: 'My Data Resource',
    list: async (context) => {
      const items = await getItems(context.userId);
      return items.map(item => ({
        uri: `myservice://${item.tenant}/data/${item.id}`,
        name: item.name
      }));
    },
    read: async (uri, context) => {
      const { id } = parseUri(uri);
      const data = await getData(id);
      return { contents: { text: JSON.stringify(data) } };
    },
    // Add subscriptions if your service has webhooks or events
    subscription: {
      onSubscribe: async (uri, subscriptionId, webhookUrl, context) => {
        // Register with your service's webhook system
        const hookId = await registerWebhook(webhookUrl);
        return { thirdPartyWebhookId: hookId };
      },
      onWebhook: async (subscriptionId, payload, headers) => {
        return {
          resourceUri: payload.resourceUri,
          changeType: payload.type,
          data: payload.data
        };
      },
      onUnsubscribe: async (uri, subscriptionId, storedData) => {
        await deleteWebhook(storedData.thirdPartyWebhookId);
      }
    }
  }
]
5. Deploy
# docker-compose.yml
services:
  mcp-server:
    build: .
    # Publish the container port only - with multiple replicas, a fixed
    # host port would conflict; front the replicas with a proxy or LB.
    ports:
      - "3000"
    environment:
      - PUBLIC_URL=https://mcp.example.com
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
    deploy:
      replicas: 3 # Scale horizontally!
  redis:
    image: redis:7-alpine
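The compose file's build: . expects a Dockerfile next to it. Here is a minimal sketch for a typical TypeScript project; the dist/server.js entry point is an assumption, so adjust to your layout:

# Dockerfile (illustrative)
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["node", "dist/server.js"]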
Key Differentiators
| Feature | Standard MCP (SSE) | Our Approach (HTTP + Webhooks) |
|---|---|---|
| Scaling | Requires sticky sessions | Stateless, trivial horizontal scaling |
| Third-party integration | Custom adapters needed | Native webhook support |
| State management | In-memory per instance | External store (Redis) |
| Load balancing | Complex | Simple round-robin |
| Serverless deployment | Difficult | Possible |
| Protocol compatibility | Full | Full (uses official SDK) |
Serverless & Scale-to-Zero Deployment
One of the most compelling advantages of our stateless HTTP approach is native compatibility with serverless platforms. Since there's no persistent connection state, MCP servers built with mcp-http-webhook can deploy to environments that scale to zero—meaning you only pay for actual usage.
AWS Lambda Deployment
AWS Lambda is ideal for MCP servers with variable traffic patterns:
// lambda.ts
import { createMCPServer } from 'mcp-http-webhook';
import { DynamoDBStore } from 'mcp-http-webhook/stores';
import { APIGatewayProxyHandler } from 'aws-lambda';

const store = new DynamoDBStore(process.env.DYNAMODB_TABLE!);

const server = createMCPServer({
  name: 'my-mcp-lambda',
  version: '1.0.0',
  publicUrl: process.env.API_GATEWAY_URL!,
  store,
  tools: [...],
  resources: [...],
});

export const handler: APIGatewayProxyHandler = async (event) => {
  return server.handleLambdaEvent(event);
};
Benefits:
- Pay-per-invocation - No cost when idle
- Automatic scaling - Handles traffic spikes without configuration
- Managed infrastructure - No servers to maintain
- Global deployment - Deploy to multiple regions with Lambda@Edge
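Wiring this up with the Serverless Framework takes only a few lines. A sketch, with the service name, table, and URL as placeholders:

# serverless.yml (illustrative)
service: my-mcp-lambda
provider:
  name: aws
  runtime: nodejs20.x
  environment:
    DYNAMODB_TABLE: mcp-subscriptions
    API_GATEWAY_URL: https://abc123.execute-api.us-east-1.amazonaws.com
functions:
  mcp:
    handler: lambda.handler
    events:
      - httpApi: '*'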
Kubernetes Knative (Scale-to-Zero)
For teams already on Kubernetes, Knative Serving provides the same scale-to-zero benefits while staying in the K8s ecosystem:
# knative-service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: mongodb-mcp
spec:
  template:
    metadata:
      annotations:
        # Keep the last pod for 5 minutes of inactivity before scaling to zero
        autoscaling.knative.dev/scale-to-zero-pod-retention-period: "5m"
        # Allow up to 100 concurrent requests per pod
        autoscaling.knative.dev/target: "100"
    spec:
      containerConcurrency: 100
      containers:
        - image: your-registry/mongodb-mcp:latest
          ports:
            - containerPort: 3000
          env:
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: mcp-secrets
                  key: redis-url
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
Knative advantages:
- Cold start optimization - Knative keeps pods warm based on traffic patterns
- Gradual rollouts - Built-in traffic splitting for canary deployments
- Scale-to-zero - Pods terminate during inactivity, reducing costs by 60-80%
- Kubernetes native - Use existing observability, security, and networking
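Deploying is a single apply, and you can watch scale-to-zero happen:

kubectl apply -f knative-service.yaml
kubectl get ksvc mongodb-mcp   # shows the service's public URL
kubectl get pods -w            # pods terminate after the idle retention period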
Architecture for 1000s of Connectors
With scale-to-zero infrastructure, deploying thousands of MCP connectors becomes economically viable:
┌─────────────────────────────────────────────────────────────────┐
│ Load Balancer / API Gateway │
├─────────────────────────────────────────────────────────────────┤
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ MongoDB │ │ Postgres │ │ Salesforce│ │ SAP │ ...x1000 │
│ │ MCP │ │ MCP │ │ MCP │ │ MCP │ │
│ │(scaled=0)│ │(scaled=2)│ │(scaled=0)│ │(scaled=1)│ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ Shared Redis / DynamoDB (Subscription State) │
└─────────────────────────────────────────────────────────────────┘
Cost model:
- Traditional approach: 1000 connectors × 2 pods × $50/month = $100,000/month
- Scale-to-zero approach: Pay only for active connectors = $5,000-15,000/month
Production at Scale: Kaman.ai
At Kaman.ai, we've put this architecture to the test. Our enterprise AI agent platform uses mcp-http-webhook to power data integrations across hundreds of customer deployments.
Pre-Built Connectors
We've already built production-ready MCP connectors for the most common enterprise data sources:
Databases:
- PostgreSQL, MySQL, Microsoft SQL Server
- MongoDB, DynamoDB
- SAP HANA, Oracle
- Snowflake, BigQuery, Redshift
File Systems & Storage:
- Google Drive, OneDrive, SharePoint
- AWS S3, Azure Blob Storage
- Dropbox, Box
- Local/network file systems (SFTP, SMB)
Business Applications:
- Salesforce (Sales Cloud, Service Cloud)
- ServiceNow (ITSM, CMDB)
- HubSpot, Zendesk
- SAP ERP, SAP S/4HANA
- Microsoft Dynamics 365
Communication & Collaboration:
- Gmail, Outlook/Exchange
- Slack, Microsoft Teams
- Google Calendar, Outlook Calendar
Developer Tools:
- GitHub, GitLab, Bitbucket
- Jira, Confluence
- Linear, Notion
Enterprise-Grade Features
Every Kaman connector includes:
- Multi-tenant isolation - Credentials and data strictly separated per organization
- OAuth 2.0 / OIDC - Secure authentication without storing passwords
- Incremental sync - Only fetch changed data, reducing API costs
- Schema mapping - Transform source schemas to your data lake format
- Data lineage - OpenLineage integration for compliance and debugging
- Rate limiting - Respect API quotas with intelligent backoff
Real Numbers
Our production deployment handles:
- 500+ active MCP connectors across customer tenants
- 10M+ tool invocations per month
- 99.9% uptime with zero persistent connection overhead
- Cold start times under 2 seconds (Knative) or 500ms (Lambda)
The scale-to-zero architecture means customers only pay for what they use—syncing a SharePoint folder once per hour costs a fraction of a cent, not $50/month for an always-on pod.
What's Next
The mcp-http-webhook library is available on npm (package name: mcp-http-webhook). We're using it in production to power:
- MongoDB data integration
- PostgreSQL connectors
- Google Drive sync
- Slack channel integrations
- And more...
If you're building MCP servers that need to scale, integrate with webhooks, or deploy to Kubernetes - this approach might save you months of engineering effort.
Get Started
npm install mcp-http-webhook
Check out the examples directory for complete implementations, and the Spec.md for detailed API documentation.
Questions? Found a bug? We'd love to hear from you. Open an issue or reach out to our team.
About Surajbhan Satpathy
Surajbhan Satpathy is a driven, forward-thinking tech entrepreneur and the Founder & CEO of Yoctotta — an endeavor rooted in his mission of bridging the gap between technology and business.
With a strong foundation in Java and software development, Surajbhan has demonstrated a commitment to building robust technical solutions. Beyond his technical expertise, he is passionate about education and mentoring: through Yoctotta’s internship initiative (YIP), he has created opportunities for BTech, BE, BSc (IT/CS), BCA and MCA graduates to experience real-world corporate development practices and sharpen their skills.
Surajbhan is also an active thought leader and content creator, regularly sharing insights on technology trends, from AI and large language models to blockchain to software engineering, on his professional network.
With a global outlook grounded in local roots (based in Odisha, India), Surajbhan blends entrepreneurial ambition, technical know-how, and a commitment to nurturing young talent.