Sentinel Platform
Overview
Sentinel is a full-stack B2B SaaS fraud-detection platform. It accepts image, video, audio, and document uploads through a React dashboard or directly with an API key. Each request passes through an API Gateway that enforces authentication, tiered rate-limiting, and usage quotas, then proxies the payload to one of several GPU-accelerated Python ML services on AWS EC2. The gateway aggregates the detection result, persists it to PostgreSQL and AWS S3, and returns a structured confidence score to the caller in real time. Two backend Node.js microservices handle all orchestration and auth; the ML models are deployed and scaled independently on GPU instances.
Description
Sentinel is an enterprise-grade synthetic-media detection API that wraps five independent deep-learning models behind a unified REST gateway. Clients authenticate with a service-specific API key (stored AES-256-CBC encrypted, with a SHA-256 hash for lookup) or a JWT bearer token. Each incoming request passes through a middleware stack (API-key validation, JWT verification, tiered rate-limiting, and per-subscription usage-quota enforcement) before the binary payload (image or audio) is forwarded to the appropriate FastAPI ML service over HTTP. The ML service runs PyTorch inference on an NVIDIA T4 GPU and returns a structured JSON result; the gateway logs the full request/response pair to PostgreSQL and stores the raw files under a namespaced S3 key with a presigned URL. A Stripe-backed subscription system gates access to each detection vertical (Deepfake, GenAI, QR/URL) separately, and a React dashboard exposes per-user usage analytics, detection history with refreshed presigned URLs, and a checkout flow. The system is fully stateless at the application tier: all session state lives in the database, all files live in S3, and services are containerised for horizontal scaling.
What it is and what it does
The core request lifecycle has three phases: authentication & quota, detection orchestration, and persistence & response.
In the authentication & quota phase, every request first hits the NGINX reverse proxy (TLS termination, CORS, 30 req/s rate limit) which routes auth-related paths to the Auth Service (port 3001) and everything else to the API Gateway (port 3002). The API Gateway's ApiKeyManager reads the x-api-key header, derives a SHA-256 hash, looks the hash up in the api_keys table, and — if found — decrypts the stored AES-256-CBC value to confirm it matches; on mismatch it falls back to a legacy hashed_api_key field on the user row for backward compatibility. JWT paths go through AuthHandler, which POSTs the token to the Auth Service's /validate-token endpoint and attaches the returned user object to the request. After identity is confirmed, UsageCheckMiddleware queries the subscriptions and usage_records tables via UsageService: it calculates current-period consumption against the tier limit (FREE, BASIC, GEN_AI_DETECTION, DEEPFAKE_DETECTION, URL_QR_DETECTION, ENTERPRISE) and rejects the request with HTTP 429 if the quota is exceeded, or passes it through with the remaining credit count.
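The quota decision at the end of this phase can be sketched as a pure function. This is only an illustration under stated assumptions, not the project's UsageService: the tier names come from the text above, but the numeric limits are placeholders (the real values live in PRICING_CONFIG and are not given here), and checkQuota, QuotaDecision, and TIER_LIMITS are hypothetical names.

```typescript
// Tier names are taken from the description; the limits are PLACEHOLDERS,
// not the real PRICING_CONFIG values.
type Tier =
  | "FREE"
  | "BASIC"
  | "GEN_AI_DETECTION"
  | "DEEPFAKE_DETECTION"
  | "URL_QR_DETECTION"
  | "ENTERPRISE";

const TIER_LIMITS: Record<Tier, number> = {
  FREE: 100,
  BASIC: 1_000,
  GEN_AI_DETECTION: 5_000,
  DEEPFAKE_DETECTION: 5_000,
  URL_QR_DETECTION: 5_000,
  ENTERPRISE: 100_000,
};

interface QuotaDecision {
  allowed: boolean;
  status: 200 | 429; // 429 when the period quota is exhausted
  remaining: number; // credits left before this request is charged
}

// Compare current-period consumption against the tier limit.
function checkQuota(tier: Tier, usedThisPeriod: number, cost = 1): QuotaDecision {
  const remaining = Math.max(0, TIER_LIMITS[tier] - usedThisPeriod);
  if (remaining < cost) {
    return { allowed: false, status: 429, remaining };
  }
  return { allowed: true, status: 200, remaining };
}
```

In this sketch the remaining count is reported before the current request is charged, which matches the lifecycle described here: the UsageRecord is only written after the detection completes.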
In the detection orchestration phase, the relevant controller (e.g. faceSwapDetectionController) calls DetectionOrchestrationService, a TypeDI-injected facade that centralises the five-step detection workflow: (1) upload the raw file buffer to S3 via S3StorageService under a requests/{userId}/{taskId}/ prefix; (2) call the external ML service via the corresponding integration class (e.g. FaceSwapDetectionService) using axios with multipart form-data and a configurable timeout; (3) normalise the ML response — for example the face-swap service returns either a flat {verdict, confidence, faces[]} shape from dev builds or a {results:[{file_name, faces[]}]} shape from production, and the integration layer normalises both; (4) store the JSON response to S3 under responses/{userId}/{taskId}/ via DetectionS3Service; (5) upsert a ServiceRequestLog row in Postgres (with all S3 keys and a 7-day presigned URL) and write a UsageRecord row to decrement the quota. The orchestrator returns an OrchestrationResult containing the task ID, ML response, processing time in milliseconds, S3 URLs, and credits consumed.
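Step (3) is the least obvious of the five, so here is a minimal sketch of how the two face-swap response shapes might be folded into one. The two input shapes are the ones named above; the output type NormalisedResult, the helper isWrapped, and the choice of null for fields a shape does not carry are assumptions made for illustration, and per-face fields are left opaque because their exact names are not listed here.

```typescript
// Per-face payload is kept opaque; its exact fields are not specified here.
type Face = Record<string, unknown>;

// Flat shape attributed to dev builds.
interface FlatResponse {
  verdict: string;
  confidence: number;
  faces: Face[];
}

// Wrapped shape attributed to production.
interface WrappedResponse {
  results: Array<{ file_name: string; faces: Face[] }>;
}

// Hypothetical unified shape handed to the rest of the gateway.
interface NormalisedResult {
  fileName: string | null;
  verdict: string | null;
  confidence: number | null;
  faces: Face[];
}

function isWrapped(raw: unknown): raw is WrappedResponse {
  return (
    typeof raw === "object" &&
    raw !== null &&
    Array.isArray((raw as { results?: unknown }).results)
  );
}

// Always return a list, one entry per file, whichever shape arrived.
function normaliseFaceSwapResponse(raw: FlatResponse | WrappedResponse): NormalisedResult[] {
  if (isWrapped(raw)) {
    return raw.results.map((item) => ({
      fileName: item.file_name,
      verdict: null,
      confidence: null,
      faces: item.faces,
    }));
  }
  return [
    { fileName: null, verdict: raw.verdict, confidence: raw.confidence, faces: raw.faces },
  ];
}
```

Returning a list in both cases means downstream code never has to branch on which ML build produced the response.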
In the persistence & response phase, the detection result, S3 keys, and presigned URL are returned to the caller as JSON. Historical queries refresh presigned URLs on the fly by calling S3StorageService.generatePresignedUrl (1-hour TTL) for each row, so frontend calls to /api/history always return accessible links even after the original 7-day URL has expired.
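The refresh-on-read behaviour can be sketched as below. The presign parameter stands in for S3StorageService.generatePresignedUrl, whose real signature is not shown in this document; HistoryRow, PresignFn, and refreshHistoryUrls are likewise hypothetical names, and only the 1-hour TTL comes from the text.

```typescript
// Stand-in for S3StorageService.generatePresignedUrl (signature assumed).
type PresignFn = (s3Key: string, expiresInSeconds: number) => Promise<string>;

// Hypothetical subset of a history row.
interface HistoryRow {
  taskId: string;
  s3Key: string;
  presignedUrl: string;
}

const HISTORY_URL_TTL_SECONDS = 60 * 60; // 1-hour TTL for history queries

// Re-sign every row so the caller never receives an expired link.
// Input rows are not mutated; fresh copies are returned.
async function refreshHistoryUrls(rows: HistoryRow[], presign: PresignFn): Promise<HistoryRow[]> {
  return Promise.all(
    rows.map(async (row) => ({
      ...row,
      presignedUrl: await presign(row.s3Key, HISTORY_URL_TTL_SECONDS),
    })),
  );
}
```

Injecting the signer keeps the sketch independent of any AWS SDK and makes the function easy to test with a fake.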
Capabilities
Sentinel detects five categories of synthetic media:
- Face-swap / deepfake detection: submits a JPEG/PNG/MP4 to a PyTorch MoE (Mixture-of-Experts) ensemble model; the ML service returns per-face blended_fakeness_score values with bounding-box coordinates and a top-level verdict string.
- AI-generated image detection: sends an image to a CNN + Transformer model that outputs confidence_score and an is_ai_generated boolean.
- AI content & voice detection: analyses audio or mixed-media payloads for signs of voice cloning or synthetic speech.
- Document fraud detection: inspects uploaded IDs or documents for tampering artifacts.
- QR code and URL classification: accepts raw URLs or QR-code images; batch endpoints accept up to ten items and fan the requests out in parallel with Promise.all.
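The batch fan-out mentioned in the last item might look roughly like this. The ten-item cap and the use of Promise.all come from the list above; classifyBatch, Classifier, and MAX_BATCH_SIZE are hypothetical names, and rejecting an oversized batch with a RangeError is an assumption about how the limit is enforced.

```typescript
const MAX_BATCH_SIZE = 10; // batch endpoints accept up to ten items

// Any per-item classifier (URL or QR image reference) can be plugged in.
type Classifier<T> = (item: string) => Promise<T>;

// Fan all items out in parallel; results come back in input order.
async function classifyBatch<T>(items: string[], classify: Classifier<T>): Promise<T[]> {
  if (items.length > MAX_BATCH_SIZE) {
    throw new RangeError(`batch limited to ${MAX_BATCH_SIZE} items, got ${items.length}`);
  }
  return Promise.all(items.map((item) => classify(item)));
}
```

Note that Promise.all rejects as soon as any single item fails. If partial results are preferable, Promise.allSettled is the standard alternative, at the cost of handling per-item status in the caller.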
The subscription system uses Stripe Checkout to gate each vertical independently. On registration, AuthService creates a FREE-tier subscription for all three service categories via PaymentService; upgrades trigger a Stripe Checkout session and a webhook that updates the subscriptions row. UsageService tracks per-period consumption against tier limits defined in PRICING_CONFIG, with separate caps per service category.
API keys are scoped per service type: a user can hold one key for FACE_SWAP_DETECTION and a separate key for QR_URL_DETECTION, each with independent rate limits. The RateLimiterService uses express-rate-limit with an in-process MemoryStore today and keeps a Redis-backed RedisStore path in the code that can be enabled for production scaling.
Winston-based structured logging (via common-utils/logging) emits JSON in production and colourised output in development; each log line carries service, module, correlationId, and userId context, with automatic redaction of passwords, tokens, and email addresses.
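As an illustration of the log redaction described above, the following sketch scrubs a log payload before it is emitted. It does not reproduce common-utils/logging: the key pattern, the email regular expression, the "[REDACTED]" marker, and the redact function itself are all assumptions chosen for the example.

```typescript
// Field names that are masked wholesale (pattern is an assumption).
const SENSITIVE_KEYS = /password|token|secret|authorization/i;
// Deliberately simple email matcher; good enough for log scrubbing.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const REDACTED = "[REDACTED]";

// Recursively copy a value, masking sensitive fields and any email
// address found inside string values. The input is not mutated.
function redact(value: unknown, key = ""): unknown {
  if (key !== "" && SENSITIVE_KEYS.test(key)) return REDACTED;
  if (typeof value === "string") return value.replace(EMAIL_PATTERN, REDACTED);
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (typeof value === "object" && value !== null) {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value)) {
      out[k] = redact(v, k);
    }
    return out;
  }
  return value;
}
```

Context fields such as correlationId and userId pass through untouched, so redacted lines remain traceable.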
Implementation
Auth Service (services/auth-service): AuthService orchestrates four specialised services: TokenService (JWT generation and verification via jsonwebtoken), PasswordService (bcrypt hashing with configurable rounds), EmailVerificationService (time-limited tokens delivered through Resend), and UserService (Prisma CRUD over the users table). Registration creates the user row, sends a verification email, and then calls PaymentService.createFreeSubscription to seed three ACTIVE subscriptions. Login verifies the bcrypt hash and issues a signed JWT carrying userId, email, tier, firstName, and lastName. WorkOS SSO and Auth0 are integrated as alternative identity providers. All business logic is behind TypeDI-injected interfaces (IAuthService, IPaymentService) to enable unit testing with mock implementations.
API Gateway (services/api-gateway): The gateway is an Express application assembled in APIGateway.ts with a layered middleware stack: SecurityMiddleware (Helmet, CORS with an allowlist), RequestLogger, RateLimiterService, ApiKeyManager or AuthHandler, and UsageCheckMiddleware. Routes are grouped into sentinel-defence (detection endpoints) and sentinel-attack (generation endpoints). ApiKeyManager stores keys using AES-256-CBC encryption (IV prepended to ciphertext, in iv:ciphertext format) keyed from API_KEY_ENCRYPTION_SECRET, and computes a SHA-256 lookup hash so that a presented key can be found by its hash rather than by decrypting stored values. DetectionOrchestrationService implements the facade pattern: it holds no ML-specific knowledge and delegates to S3StorageService, DetectionS3Service, UsageService, and RequestHistoryService injected by TypeDI, making each component independently testable.
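A minimal sketch of the key-storage scheme, using only Node's built-in crypto module. The AES-256-CBC cipher, the iv:ciphertext layout, and the SHA-256 lookup hash come from the description; everything else is an assumption: hex encoding, deriving the 32-byte AES key by hashing the secret, and the function names are all hypothetical, since the real ApiKeyManager internals are not shown.

```typescript
import {
  createCipheriv,
  createDecipheriv,
  createHash,
  randomBytes,
  timingSafeEqual,
} from "node:crypto";

// ASSUMPTION: a 32-byte AES key is derived by hashing the secret.
function deriveKey(secret: string): Buffer {
  return createHash("sha256").update(secret).digest();
}

// Deterministic hash used to find the row for a presented key.
function lookupHash(apiKey: string): string {
  return createHash("sha256").update(apiKey).digest("hex");
}

// Encrypt with a fresh random IV and store it as "iv:ciphertext" (hex assumed).
function encryptApiKey(apiKey: string, secret: string): string {
  const iv = randomBytes(16);
  const cipher = createCipheriv("aes-256-cbc", deriveKey(secret), iv);
  const ciphertext = Buffer.concat([cipher.update(apiKey, "utf8"), cipher.final()]);
  return `${iv.toString("hex")}:${ciphertext.toString("hex")}`;
}

function decryptApiKey(stored: string, secret: string): string {
  const [ivHex, dataHex] = stored.split(":");
  if (!ivHex || !dataHex) throw new Error("expected iv:ciphertext format");
  const decipher = createDecipheriv("aes-256-cbc", deriveKey(secret), Buffer.from(ivHex, "hex"));
  const plain = Buffer.concat([decipher.update(Buffer.from(dataHex, "hex")), decipher.final()]);
  return plain.toString("utf8");
}

// Confirm the presented key matches the decrypted stored value,
// comparing in constant time once the lengths agree.
function verifyApiKey(presented: string, stored: string, secret: string): boolean {
  const a = Buffer.from(presented, "utf8");
  const b = Buffer.from(decryptApiKey(stored, secret), "utf8");
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Because the IV is random, encrypting the same key twice yields different stored values, which is why a separate deterministic hash is needed for lookup.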
ML Integration Layer (sentinel-defence/integrations): Each ML service has a TypeScript integration class (FaceSwapDetectionService, AIImageDetectorService, QrUrlDetectionService, etc.) responsible only for constructing the HTTP request (multipart or JSON), handling the service-specific response shape (including normalising old vs new payload schemas), translating HTTP 422 "no face detected" into a typed Error, and surfacing sanitised error messages to callers. Timeouts and retry counts are configurable per service via environment variables and the ServiceConfigProvider.
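The 422 translation could be sketched as follows. The document only says that HTTP 422 becomes a typed Error and that messages are sanitised; the class names NoFaceDetectedError and MlServiceError, the message wording, and the toTypedError helper are hypothetical.

```typescript
// Hypothetical typed error for the "no face detected" case.
class NoFaceDetectedError extends Error {
  constructor(message = "No face detected in the uploaded media") {
    super(message);
    this.name = "NoFaceDetectedError";
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Generic upstream failure; carries only the status code so that
// no raw upstream body leaks to the caller.
class MlServiceError extends Error {
  readonly status: number;

  constructor(status: number) {
    super(`Detection service failed with status ${status}`);
    this.name = "MlServiceError";
    this.status = status;
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Map an upstream HTTP status to the error surfaced to callers.
function toTypedError(status: number): Error {
  return status === 422 ? new NoFaceDetectedError() : new MlServiceError(status);
}
```

A controller can then branch on `instanceof NoFaceDetectedError` to return a client-facing validation error instead of a generic upstream failure.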
Database & Storage: Prisma ORM manages a PostgreSQL schema with models for User, ApiKey, Subscription, Invoice, UsageRecord, ServiceRequestLog, DetectionResult, ApiUsageStats, Session, and the full OAuth2 client/token/code triple. ServiceRequestLog carries S3 keys and presigned-URL metadata alongside the full request/response JSON payload, so every detection is fully auditable. DetectionResult rows linked to a ServiceRequestLog capture per-model confidence, detected boolean, modelName, and processingTimeMs. S3 keys follow the structure requests/{userId}/{taskId}/{filename} and responses/{userId}/{taskId}/response.json; presigned URLs use a 7-day default and are refreshed on-demand to 1-hour TTL for history queries.
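The S3 key layout and the two TTLs can be captured in a few lines. The key structure and the 7-day and 1-hour durations are taken from the paragraph above; the helper names and the basename step (dropping any directory part of the client-supplied filename) are assumptions added for the example.

```typescript
// Presigned-URL lifetimes described above, in seconds.
const PRESIGN_TTL = {
  defaultSeconds: 7 * 24 * 60 * 60, // 7-day default
  historySeconds: 60 * 60, // 1-hour refresh for history queries
};

// ASSUMPTION: keep only the final path segment of the uploaded filename.
function baseName(filename: string): string {
  const parts = filename.split(/[\\/]/);
  return parts[parts.length - 1] || "upload";
}

// requests/{userId}/{taskId}/{filename}
function requestKey(userId: string, taskId: string, filename: string): string {
  return `requests/${userId}/${taskId}/${baseName(filename)}`;
}

// responses/{userId}/{taskId}/response.json
function responseKey(userId: string, taskId: string): string {
  return `responses/${userId}/${taskId}/response.json`;
}
```

Sharing the `{userId}/{taskId}` segment between both prefixes means the request file and its response can be located from a single ServiceRequestLog row.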
Frontend (Next.js / React at `app.scam.ai`): The React dashboard provides a detection upload interface per service vertical, a usage analytics page, a detection history feed with live-refreshed S3 image previews, and Stripe-hosted checkout and billing management pages. API calls from the frontend are authenticated with a stored API key passed as the x-api-key header, and error states surface the structured { success, error, message } envelope returned by the gateway.
Demo
There is no live public demo. A full product demo can be requested through the company website.
Tech & Tools
- Node.js · TypeScript · Express
- Python · PyTorch · FastAPI · CUDA 11.8 · cuDNN 8
- React · Next.js
- PostgreSQL · Prisma ORM
- AWS EC2 (NVIDIA T4 GPU) · AWS S3
- Docker · Docker Compose · NGINX
- Stripe · WorkOS · Auth0 · Resend
- TypeDI · bcrypt · jsonwebtoken · Winston · AES-256-CBC
Highlights
- Stateless microservices (Auth Service + API Gateway) deployable horizontally; all state in PostgreSQL and S3
- AES-256-CBC API key encryption with SHA-256 hash lookup; per-service-type key scoping
- GPU-accelerated ML inference on NVIDIA T4 (PyTorch + CUDA), reducing detection latency from ~4 s (CPU) to ~60 ms
- FastAPI ML wrappers with multi-worker concurrent processing (4 Uvicorn worker processes per service, sidestepping the Python GIL), achieving 4× throughput
- Facade-pattern DetectionOrchestrationService centralising S3 upload, ML call, DB logging, and usage deduction across all detection verticals
- Tiered subscription billing via Stripe with independent per-vertical quota enforcement and overage tracking
- Full request/response audit trail in PostgreSQL with S3-backed file storage and automatically refreshed presigned URLs