OpenNoodl/dev-docs/tasks/phase-11-cloud-functions/cloud-functions-revival-plan.md
Richard Osborne ddcb9cd02e feat: Phase 5 BYOB foundation + Phase 3 GitHub integration
Phase 5 - BYOB Backend (TASK-007A/B):
- LocalSQL Adapter with full CloudStore API compatibility
- QueryBuilder translates Parse-style queries to SQL
- SchemaManager with PostgreSQL/Supabase export
- LocalBackendServer with REST endpoints
- BackendManager with IPC handlers for Electron
- In-memory fallback when better-sqlite3 unavailable

Phase 3 - GitHub Panel (GIT-004):
- Issues tab with list/detail views
- Pull Requests tab with list/detail views
- GitHub API client with OAuth support
- Repository info hook integration

Phase 3 - Editor UX Bugfixes (TASK-013):
- Legacy runtime detection banners
- Read-only enforcement for legacy projects
- Code editor modal close improvements
- Property panel stuck state fix
- Blockly node deletion and UI polish

Phase 11 - Cloud Functions Planning:
- Architecture documentation for workflow automation
- Execution history storage schema design
- Canvas overlay concept for debugging

Docs: Updated LEARNINGS.md and COMMON-ISSUES.md
2026-01-15 17:37:15 +01:00


# Cloud Functions Revival: n8n Alternative Vision
**Status:** Planning / Not Yet Started
**Strategic Goal:** Transform Nodegx into a viable workflow automation platform competing with n8n
**Proposed Phase:** 4 (or standalone initiative)
**Total Estimated Effort:** 12-16 weeks
---
## Executive Summary
This document outlines a comprehensive plan to revive and modernize Nodegx's cloud functions system, transforming it from a legacy Parse Server dependency into a powerful, self-hosted workflow automation platform. The vision includes dual-runtime support (JavaScript and Python), execution history, deployment automation, and production monitoring, positioning Nodegx as a serious alternative to tools like n8n, Zapier, and Make.
---
## Current State Analysis
### What Exists Today
#### 1. Basic Cloud Function Infrastructure (Legacy)
**Location:** `packages/noodl-viewer-cloud/`
```
Current Architecture (Parse-dependent):
┌─────────────────────────────────────────┐
│ Editor: Cloud Functions Panel           │
│ - Create/edit visual workflows          │
│ - Components prefixed /#__cloud__/      │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│ CloudRunner (Runtime)                   │
│ - Executes visual workflows             │
│ - Depends on Parse Server               │
│ - Request/Response nodes                │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│ Parse Server (External Dependency)      │
│ - Database                              │
│ - Authentication                        │
│ - Cloud function hosting                │
└─────────────────────────────────────────┘
```
**Available Nodes:**
- `Cloud Request` - Entry point for cloud functions
- `Cloud Response` - Exit point with status codes
- `Aggregate` - Database aggregation queries
- Standard data nodes (Query, Create, Update, Delete) - Parse-dependent
**Limitations:**
- ❌ Tightly coupled to Parse Server
- ❌ No local execution during development
- ❌ No execution history or debugging
- ❌ No deployment automation
- ❌ No monitoring or observability
- ❌ No webhook triggers or scheduled tasks
- ❌ No internal event system
- ❌ Cannot run independently of editor
#### 2. In-Progress: Local Backend Integration (TASK-007)
**Status:** Planned but not implemented
**Goal:** Replace Parse dependency with local SQLite + Express server
**Sub-tasks:**
- TASK-007A: LocalSQL Adapter (data layer)
- TASK-007B: Backend Server (Express API + WebSocket)
- TASK-007C: Workflow Runtime (adapting CloudRunner)
- TASK-007D: Schema Management
- TASK-007E: Editor Integration
- TASK-007F: Standalone Deployment (Electron bundling only)
**What This Provides:**
- ✅ Local development without Parse
- ✅ SQLite database
- ✅ Visual workflow execution
- ✅ Database CRUD nodes
- ✅ Basic trigger nodes (Schedule, DB Change, Webhook)
**What's Still Missing:**
- ❌ Production deployment (cloud servers)
- ❌ Execution history
- ❌ Monitoring/observability
- ❌ Webhook endpoint management
- ❌ Advanced trigger types
- ❌ Error handling/retry logic
- ❌ Rate limiting
- ❌ Authentication/authorization
- ❌ Multi-environment support (dev/staging/prod)
#### 3. Deployment Infrastructure (TASK-005 DEPLOY series)
**Status:** Frontend-only deployment automation
**What Exists:**
- GitHub Actions integration
- Deploy to Netlify, Vercel, Cloudflare Pages
- Deploy button in editor
- Environment management
**What's Missing:**
- ❌ Backend/cloud function deployment
- ❌ Docker container deployment to cloud
- ❌ Database migration on deploy
- ❌ Cloud function versioning
- ❌ Rollback capabilities
---
## What's Missing: Gap Analysis
### 1. ❌ Trigger System (n8n equivalent)
**Missing Capabilities:**
| Feature | n8n | Nodegx Current | Nodegx Needed |
|---------|-----|----------------|---------------|
| Webhook triggers | ✅ | ❌ | ✅ |
| Schedule/Cron | ✅ | Planned (TASK-007C) | ✅ |
| Manual triggers | ✅ | ✅ (Request node) | ✅ |
| Database change events | ✅ | Planned (TASK-007C) | ✅ |
| Internal events | ✅ | ❌ | ✅ |
| Queue triggers | ✅ | ❌ | Future |
| File watch | ✅ | ❌ | Future |
| External integrations | ✅ | ❌ | Future Phase |
**Required Nodes:**
```
Trigger Nodes (Priority 1):
├── Webhook Trigger
│ └── Exposes HTTP endpoint
│ └── Captures request data
│ └── Supports authentication
│ └── CORS configuration
├── Schedule Trigger
│ └── Cron expressions
│ └── Interval-based
│ └── Timezone support
├── Manual Trigger
│ └── Test execution button
│ └── Input parameters
└── Internal Event Trigger
└── Event bus subscription
└── Custom event names
└── Event filtering
```
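The Webhook Trigger's URL management can be sketched as a small registry mapping generated endpoint paths to workflows, with optional shared-secret auth. The `WebhookRegistry` name and shape are illustrative, not the eventual TASK-007 API:

```typescript
// Sketch: registry mapping generated webhook URLs to workflow IDs.
type WebhookEntry = {
  workflowId: string;
  apiKey?: string;       // optional shared-secret authentication
  corsOrigins: string[]; // allowed origins for CORS preflight
};

class WebhookRegistry {
  private entries = new Map<string, WebhookEntry>();

  /** Generate a stable endpoint path for a workflow. */
  register(workflowId: string, opts: Partial<WebhookEntry> = {}): string {
    const path = `/webhooks/${workflowId.toLowerCase()}`;
    this.entries.set(path, { workflowId, corsOrigins: ["*"], ...opts });
    return path;
  }

  /** Resolve an incoming request path; enforce the API key if one is set. */
  resolve(path: string, apiKey?: string): WebhookEntry | null {
    const entry = this.entries.get(path);
    if (!entry) return null;
    if (entry.apiKey && entry.apiKey !== apiKey) return null;
    return entry;
  }
}
```

The runtime's HTTP layer would call `resolve()` on each incoming request and hand the mapped workflow to the executor.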
### 2. ❌ Execution History & Debugging
**Missing Capabilities:**
**What n8n provides:**
- Complete execution log for each workflow run
- Input/output data for every node
- Execution timeline visualization
- Error stack traces
- "Pin" execution data to canvas
- Search/filter execution history
- Export execution data
**What Nodegx needs:**
```
Execution History System:
┌─────────────────────────────────────────────────────┐
│ Execution Record │
├─────────────────────────────────────────────────────┤
│ - ID: exec_abc123xyz │
│ - Workflow: /#__cloud__/ProcessOrder │
│ - Trigger: webhook_payment_received │
│ - Started: 2025-01-15 14:23:45 │
│ - Duration: 1.2s │
│ - Status: Success / Error / Running │
│ - Input Data: { orderId: 12345, ... } │
│ │
│ Node Execution Steps: │
│ ├─ [Request] ─────────────── 0ms ✓ │
│ │ Input: { orderId: 12345 } │
│ │ Output: { orderId: 12345, userId: 789 } │
│ │ │
│ ├─ [Query DB] ────────────── 45ms ✓ │
│ │ Input: { userId: 789 } │
│ │ Output: { user: {...}, orders: [...] } │
│ │ │
│ ├─ [HTTP Request] ───────── 890ms ✓ │
│ │ Input: { endpoint: '/api/charge', ... } │
│ │ Output: { success: true, transactionId: ... } │
│ │ │
│ └─ [Response] ────────────── 5ms ✓ │
│ Input: { statusCode: 200, ... } │
│ Output: { statusCode: 200, body: {...} } │
└─────────────────────────────────────────────────────┘
```
**Implementation Requirements:**
- Persistent storage (SQLite or separate DB)
- Efficient querying (indexes on workflow, status, timestamp)
- Data retention policies
- Privacy controls (PII redaction)
- Canvas overlay UI to show pinned execution
- Timeline visualization component
### 3. ❌ Production Deployment System
**Missing Infrastructure:**
Current deployment stops at the frontend. Cloud functions need:
```
Required Deployment Architecture:
┌─────────────────────────────────────────────────────┐
│ Local Development │
│ ├─ Editor (with cloud functions panel) │
│ ├─ Local Backend Server (SQLite + Express) │
│ └─ Hot-reload on changes │
└─────────────────┬───────────────────────────────────┘
│ Deploy Command
┌─────────────────────────────────────────────────────┐
│ Build & Package │
│ ├─ Compile workflows to optimized format │
│ ├─ Bundle dependencies │
│ ├─ Generate Dockerfile │
│ ├─ Create docker-compose.yml │
│ └─ Package database schema + migrations │
└─────────────────┬───────────────────────────────────┘
│ Push to Registry
┌─────────────────────────────────────────────────────┐
│ Container Registry │
│ ├─ Docker Hub │
│ ├─ GitHub Container Registry │
│ └─ AWS ECR / Google GCR │
└─────────────────┬───────────────────────────────────┘
│ Deploy to Platform
┌─────────────────────────────────────────────────────┐
│ Cloud Hosting Options │
│ ├─ Fly.io (easiest, auto-scaling) │
│ ├─ Railway (developer-friendly) │
│ ├─ Render (simple, affordable) │
│ ├─ DigitalOcean App Platform │
│ ├─ AWS ECS / Fargate │
│ ├─ Google Cloud Run │
│ └─ Self-hosted VPS (Docker Compose) │
└─────────────────────────────────────────────────────┘
```
**Deployment Providers to Support:**
Priority 1 (Simple PaaS):
- **Fly.io** - Best for this use case (auto-scaling, global, simple)
- **Railway** - Developer favorite, easy setup
- **Render** - Affordable, straightforward
Priority 2 (Traditional Cloud):
- **AWS** (ECS/Fargate + RDS)
- **Google Cloud** (Cloud Run + Cloud SQL)
- **DigitalOcean** (App Platform + Managed DB)
Priority 3 (Self-hosted):
- **Docker Compose** templates for VPS deployment
- **Kubernetes** manifests (advanced users)
**Required Features:**
- One-click deploy from editor
- Environment variable management
- Database migration handling
- SSL/TLS certificate automation
- Domain/subdomain configuration
- Health checks and auto-restart
- Log streaming to editor
- Blue-green or rolling deployments
- Rollback capability
### 4. ❌ Monitoring & Observability
**Missing Dashboards:**
```
Required Monitoring Views:
┌─────────────────────────────────────────────────────┐
│ Workflow Monitoring Dashboard │
├─────────────────────────────────────────────────────┤
│ │
│ Active Workflows: │
│ ┌───────────────────────────────────────────────┐ │
│ │ ProcessOrder ● Running │ │
│ │ └─ Requests: 1,234 (24h) │ │
│ │ └─ Success: 98.5% │ │
│ │ └─ Avg Response: 450ms │ │
│ │ └─ Errors: 18 (last 24h) │ │
│ │ │ │
│ │ SendWelcomeEmail ● Running │ │
│ │ └─ Requests: 456 (24h) │ │
│ │ └─ Success: 100% │ │
│ │ └─ Avg Response: 1.2s │ │
│ │ │ │
│ │ GenerateReport ⏸ Paused │ │
│ │ └─ Last run: 2 hours ago │ │
│ └───────────────────────────────────────────────┘ │
│ │
│ Performance Metrics (Last 24h): │
│ ┌───────────────────────────────────────────────┐ │
│ │ Total Executions: 1,690 │ │
│ │ Success Rate: 98.9% │ │
│ │ Avg Duration: 680ms │ │
│ │ P95 Duration: 2.1s │ │
│ │ P99 Duration: 5.8s │ │
│ │ Total Errors: 18 │ │
│ └───────────────────────────────────────────────┘ │
│ │
│ Recent Errors: │
│ ┌───────────────────────────────────────────────┐ │
│ │ 14:23 ProcessOrder: Database timeout │ │
│ │ 13:45 ProcessOrder: Invalid JSON in request │ │
│ │ 12:10 ProcessOrder: HTTP 500 from Stripe API │ │
│ └───────────────────────────────────────────────┘ │
│ │
│ [View All Executions] [Export Logs] │
└─────────────────────────────────────────────────────┘
```
**Metrics to Track:**
- Execution count (by workflow, by time period)
- Success/error rates
- Response time percentiles (P50, P95, P99)
- Error types and frequency
- Resource usage (CPU, memory, disk)
- Active webhook endpoints
- Scheduled job status
- Queue depth (if implementing queues)
**Alerting System:**
- Email notifications on errors
- Webhook notifications
- Threshold alerts (e.g., error rate > 5%)
- Slack integration (future)
### 5. ❌ Advanced Workflow Features
**Missing Flow Control:**
**n8n provides:**
- IF/ELSE conditions
- Switch nodes (multiple branches)
- Loop nodes (iterate over arrays)
- Error handling nodes
- Merge nodes (combine branches)
- Split nodes (parallel execution)
- Wait/Delay nodes
- Code nodes (custom JavaScript/Python)
**Nodegx currently has:**
- Basic signal flow
- Limited logic nodes
**Required Logic Nodes:**
```
Control Flow Nodes:
├── IF Condition
│ └── Supports complex expressions
│ └── Multiple condition groups (AND/OR)
│ └── True/False branches
├── Switch
│ └── Multiple case branches
│ └── Default case
│ └── Expression-based routing
├── For Each
│ └── Iterate over arrays
│ └── Access item and index
│ └── Batch size control
├── Merge
│ └── Wait for all branches
│ └── Wait for any branch
│ └── Combine outputs
├── Error Handler
│ └── Try/catch equivalent
│ └── Retry logic
│ └── Fallback behavior
└── Wait/Delay
└── Configurable duration
└── Wait for webhook
└── Wait for condition
```
**Required Data Nodes:**
```
Data Manipulation Nodes:
├── Set Variable
│ └── Create/update variables
│ └── Expression support
├── Transform
│ └── Map/filter/reduce arrays
│ └── Object manipulation
│ └── JSON path queries
├── HTTP Request
│ └── All HTTP methods
│ └── Authentication support
│ └── Request/response transformation
├── Code (JavaScript)
│ └── Custom logic
│ └── Access to all inputs
│ └── Return multiple outputs
├── Code (Python) ← NEW
│ └── For AI/ML workflows
│ └── Access to Python ecosystem
│ └── Async/await support
└── JSON Parser
└── Parse/stringify
└── Validate schema
└── Extract values
```
---
## Proposed Implementation: The "Cloud Functions Revival" Phase
### Phase Structure
**Suggested Placement:** Between Phase 3 and Phase 5, or as Phase 4
**Total Timeline:** 12-16 weeks (3-4 months)
**Team Size:** 1-2 developers + 1 designer (for UI components)
---
## SERIES 1: Core Workflow Runtime (4 weeks)
Building on TASK-007C, complete the workflow execution system.
### WORKFLOW-001: Advanced Trigger System (1 week)
**Implement:**
- Webhook trigger nodes with URL management
- Enhanced schedule nodes with cron expressions
- Internal event trigger system
- Manual execution triggers with parameters
**Files to Create:**
```
packages/noodl-viewer-cloud/src/nodes/triggers/
├── webhook.ts
├── schedule.ts
├── internal-event.ts
└── manual.ts
packages/noodl-runtime/src/nodes/std-library/workflow-triggers/
├── webhook-trigger.js
├── schedule-trigger.js
└── event-trigger.js
```
**Key Features:**
- Webhook URL generation and management
- Request authentication (API keys, JWT)
- Cron expression editor with human-readable preview
- Event bus for internal triggers
- Test execution with sample data
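The internal event trigger rests on an event bus; a minimal sketch, with illustrative names (`WorkflowEventBus`, `on`/`emit`):

```typescript
// Minimal sketch of the event bus behind the Internal Event Trigger.
type Handler = (payload: unknown) => void;

class WorkflowEventBus {
  private subs = new Map<string, Set<Handler>>();

  /** Subscribe to a named event; returns an unsubscribe function. */
  on(event: string, handler: Handler): () => void {
    if (!this.subs.has(event)) this.subs.set(event, new Set());
    this.subs.get(event)!.add(handler);
    return () => this.subs.get(event)?.delete(handler);
  }

  /** Emit an event; returns how many subscribers received it. */
  emit(event: string, payload: unknown): number {
    const handlers = this.subs.get(event) ?? new Set<Handler>();
    for (const h of handlers) h(payload);
    return handlers.size;
  }
}
```

Event filtering would sit on top of this: a trigger node subscribes to an event name and drops payloads that fail its filter expression.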
### WORKFLOW-002: Logic & Control Flow Nodes (1.5 weeks)
**Implement:**
- IF/ELSE condition nodes
- Switch nodes (multi-branch)
- For Each loop nodes
- Merge/Split nodes
- Error handling nodes
- Wait/Delay nodes
**Files to Create:**
```
packages/noodl-runtime/src/nodes/std-library/workflow-logic/
├── if-condition.js
├── switch.js
├── for-each.js
├── merge.js
├── error-handler.js
└── wait.js
```
**Key Features:**
- Visual expression builder
- Complex condition support (AND/OR groups)
- Parallel execution where appropriate
- Automatic error propagation
- Loop iteration controls
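How AND/OR condition groups might evaluate, assuming a hypothetical `Condition` shape rather than the shipped expression format:

```typescript
// Sketch: evaluating AND/OR condition groups for the IF node.
type Condition = { left: unknown; op: "eq" | "neq" | "gt" | "lt"; right: unknown };
type ConditionGroup = { combine: "and" | "or"; conditions: Condition[] };

function evalCondition(c: Condition): boolean {
  switch (c.op) {
    case "eq":  return c.left === c.right;
    case "neq": return c.left !== c.right;
    case "gt":  return Number(c.left) > Number(c.right);
    case "lt":  return Number(c.left) < Number(c.right);
  }
}

function evalGroup(group: ConditionGroup): boolean {
  return group.combine === "and"
    ? group.conditions.every(evalCondition)   // all conditions must pass
    : group.conditions.some(evalCondition);   // any condition passing is enough
}
```

Nested groups (a group whose members are themselves groups) would extend this recursively.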
### WORKFLOW-003: Data Manipulation Nodes (1 week)
**Implement:**
- Enhanced HTTP Request node
- JSON Parser/Stringifier
- Transform node (map/filter/reduce)
- Set Variable node
- Code nodes (JavaScript, preparation for Python)
**Files to Create:**
```
packages/noodl-runtime/src/nodes/std-library/workflow-data/
├── http-request-advanced.js
├── json-parser.js
├── transform.js
├── set-variable.js
└── code-javascript.js
```
**Key Features:**
- HTTP request builder UI
- JSONPath and JMESPath support
- Visual data transformation builder
- Variable scope management
- Monaco editor for code nodes
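The core of the JSONPath support can be illustrated with a minimal dotted-path getter, a stand-in for a full JSONPath/JMESPath library:

```typescript
// Minimal dotted-path getter ("user.address.city") for the Transform node.
function getPath(obj: unknown, path: string): unknown {
  return path.split(".").reduce<unknown>(
    (cur, key) => (cur == null ? undefined : (cur as Record<string, unknown>)[key]),
    obj
  );
}
```

Array indices work as string keys ("user.orders.0.id"); missing segments short-circuit to `undefined` instead of throwing.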
### WORKFLOW-004: Error Handling & Retry Logic (0.5 weeks)
**Implement:**
- Automatic retry with exponential backoff
- Dead letter queue for failed executions
- Error categorization (retriable vs. fatal)
- Global error handlers
**Files to Modify:**
```
packages/noodl-viewer-cloud/src/LocalCloudRunner.ts
packages/noodl-runtime/src/nodes/std-library/workflow-logic/error-handler.js
```
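The retry behavior can be sketched as a small helper; parameter names and defaults are illustrative:

```typescript
// Sketch: retry with exponential backoff for retriable node failures.
async function withRetry<T>(
  fn: () => Promise<T>,
  { attempts = 3, baseMs = 100, retriable = (_e: unknown) => true } = {}
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e;
      // Fatal errors, or exhausted attempts, fall through to the throw below
      if (!retriable(e) || attempt === attempts - 1) break;
      // 100ms, 200ms, 400ms, ... between attempts
      await new Promise((r) => setTimeout(r, baseMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```

The `retriable` predicate is where error categorization plugs in: timeouts and 5xx responses retry, validation errors fail immediately and land in the dead letter queue.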
---
## SERIES 2: Execution History & Debugging (3 weeks)
### HISTORY-001: Execution Storage System (1 week)
**Implement:**
- SQLite table schema for executions
- Efficient storage of execution data
- Data retention policies
- Query APIs for execution retrieval
**Database Schema:**
```sql
CREATE TABLE workflow_executions (
  id TEXT PRIMARY KEY,
  workflow_id TEXT NOT NULL,
  workflow_name TEXT NOT NULL,
  trigger_type TEXT NOT NULL,
  trigger_data TEXT,            -- JSON
  status TEXT NOT NULL,         -- running, success, error
  started_at INTEGER NOT NULL,
  completed_at INTEGER,
  duration_ms INTEGER,
  error_message TEXT,
  error_stack TEXT,
  FOREIGN KEY (workflow_id) REFERENCES components(id)
);

CREATE TABLE execution_steps (
  id TEXT PRIMARY KEY,
  execution_id TEXT NOT NULL,
  node_id TEXT NOT NULL,
  node_name TEXT NOT NULL,
  step_index INTEGER NOT NULL,
  started_at INTEGER NOT NULL,
  completed_at INTEGER,
  duration_ms INTEGER,
  status TEXT NOT NULL,
  input_data TEXT,              -- JSON
  output_data TEXT,             -- JSON
  error_message TEXT,
  FOREIGN KEY (execution_id) REFERENCES workflow_executions(id)
);

CREATE INDEX idx_executions_workflow ON workflow_executions(workflow_id);
CREATE INDEX idx_executions_status ON workflow_executions(status);
CREATE INDEX idx_executions_started ON workflow_executions(started_at);
CREATE INDEX idx_steps_execution ON execution_steps(execution_id);
```
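To show how the indexes above get used, here is a sketch of the filtered-list query an `ExecutionStore` might build; the builder name and filter shape are illustrative:

```typescript
// Sketch: building the parameterized history query for the filter UI.
type ExecutionFilter = { workflowId?: string; status?: string; limit?: number };

function buildListQuery(f: ExecutionFilter): { sql: string; params: unknown[] } {
  const where: string[] = [];
  const params: unknown[] = [];
  if (f.workflowId) { where.push("workflow_id = ?"); params.push(f.workflowId); }
  if (f.status)     { where.push("status = ?");      params.push(f.status); }
  const sql =
    "SELECT * FROM workflow_executions" +
    (where.length ? ` WHERE ${where.join(" AND ")}` : "") +
    " ORDER BY started_at DESC LIMIT ?"; // served by idx_executions_started
  params.push(f.limit ?? 50);
  return { sql, params };
}
```

Both filters map onto the `idx_executions_workflow` and `idx_executions_status` indexes, so the list view stays fast as history grows.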
**Files to Create:**
```
packages/noodl-viewer-cloud/src/execution-history/
├── ExecutionStore.ts
├── ExecutionLogger.ts
└── RetentionManager.ts
```
### HISTORY-002: Execution Logger Integration (0.5 weeks)
**Implement:**
- Hook into CloudRunner to log all execution steps
- Capture input/output for each node
- Track timing and performance
- Handle large data (truncation, compression)
**Files to Modify:**
```
packages/noodl-viewer-cloud/src/LocalCloudRunner.ts
```
**Key Features:**
- Minimal performance overhead
- Configurable data capture (full vs. minimal)
- Automatic PII redaction options
- Compression for large payloads
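Handling large data might look like this size-capped capture helper; the `__truncated` marker field is an assumption:

```typescript
// Sketch: cap stored node payloads so execution history stays small.
function capturePayload(data: unknown, maxBytes = 16 * 1024): string {
  const json = JSON.stringify(data) ?? "null"; // stringify(undefined) -> undefined
  if (json.length <= maxBytes) return json;
  // Keep a prefix for the detail viewer, and record the original size
  return JSON.stringify({
    __truncated: true,
    originalLength: json.length,
    preview: json.slice(0, maxBytes),
  });
}
```

The history UI can detect the marker and show "payload truncated at 16 KB" instead of a broken object; PII redaction would run before this step.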
### HISTORY-003: Execution History UI (1 week)
**Implement:**
- Execution list panel
- Search and filter controls
- Execution detail view
- Timeline visualization
**Files to Create:**
```
packages/noodl-editor/src/editor/src/views/ExecutionHistory/
├── ExecutionHistoryPanel.tsx
├── ExecutionList.tsx
├── ExecutionDetail.tsx
├── ExecutionTimeline.tsx
└── ExecutionHistoryPanel.module.scss
```
**UI Components:**
- Filterable list (by workflow, status, date range)
- Execution timeline with node-by-node breakdown
- Expandable step details (input/output viewer)
- Search across all execution data
- Export to JSON/CSV
### HISTORY-004: Canvas Execution Overlay (0.5 weeks)
**Implement:**
- "Pin execution" feature
- Overlay execution data on canvas
- Show data flow between nodes
- Highlight error paths
**Files to Create:**
```
packages/noodl-editor/src/editor/src/views/nodeGraph/
├── ExecutionOverlay.tsx
├── NodeExecutionBadge.tsx
└── ConnectionDataFlow.tsx
```
**Key Features:**
- Click execution in history to pin to canvas
- Show input/output data on hover
- Animate data flow (optional)
- Highlight nodes that errored
- Time scrubbing through execution
---
## SERIES 3: Production Deployment (3 weeks)
### DEPLOY-CLOUD-001: Container Build System (1 week)
**Implement:**
- Dockerfile generator for workflows
- docker-compose template
- Environment variable management
- Database initialization scripts
**Files to Create:**
```
packages/noodl-editor/src/editor/src/services/deployment/
├── ContainerBuilder.ts
├── templates/
│ ├── Dockerfile.template
│ ├── docker-compose.yml.template
│ └── entrypoint.sh.template
└── DatabaseMigrationGenerator.ts
```
**Generated Dockerfile Example:**
```dockerfile
FROM node:18-alpine
WORKDIR /app
# Copy workflow runtime
COPY packages/noodl-viewer-cloud /app/runtime
COPY packages/noodl-runtime /app/noodl-runtime
# Copy project workflows
COPY .noodl/backend-*/workflows /app/workflows
COPY .noodl/backend-*/schema.json /app/schema.json
# Install dependencies
RUN npm ci --production
# Health check
HEALTHCHECK --interval=30s --timeout=3s \
  CMD node healthcheck.js || exit 1
# Expose port
EXPOSE 8080
# Start server
CMD ["node", "runtime/dist/server.js"]
```
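The HEALTHCHECK above invokes a `healthcheck.js` that has to be generated alongside the Dockerfile. A sketch, assuming the runtime server exposes a `/health` route on port 8080:

```typescript
// Sketch of the healthcheck script the generated Dockerfile invokes.
import http from "node:http";

/** Map the server's HTTP status to the container health exit code. */
function exitCodeFor(statusCode: number | undefined): 0 | 1 {
  return statusCode === 200 ? 0 : 1;
}

// Invoked when the container runs `node healthcheck.js`.
function runHealthcheck(): void {
  const req = http.get("http://127.0.0.1:8080/health", { timeout: 2500 }, (res) => {
    res.resume(); // drain the body so the socket can close
    process.exit(exitCodeFor(res.statusCode));
  });
  req.on("timeout", () => req.destroy()); // destroy triggers the error handler
  req.on("error", () => process.exit(1));
}
```

Docker treats exit code 0 as healthy and 1 as unhealthy, which drives the auto-restart behavior listed under DEPLOY-CLOUD-002.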
### DEPLOY-CLOUD-002: Platform Integrations (1.5 weeks)
**Implement:**
- Fly.io deployment provider
- Railway deployment provider
- Render deployment provider
- Generic Docker registry support
**Files to Create:**
```
packages/noodl-editor/src/editor/src/services/deployment/providers/
├── FlyProvider.ts
├── RailwayProvider.ts
├── RenderProvider.ts
└── GenericDockerProvider.ts
packages/noodl-editor/src/editor/src/views/deployment/
├── CloudDeployPanel.tsx
├── PlatformSelector.tsx
├── EnvironmentConfig.tsx
└── DeploymentStatus.tsx
```
**Key Features:**
- OAuth or API key authentication
- Automatic SSL/TLS setup
- Environment variable UI
- Database provisioning (where supported)
- Domain configuration
- Deployment logs streaming
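A common provider interface keeps the platform integrations interchangeable. This sketch, including a dry-run implementation, is illustrative rather than the final API:

```typescript
// Sketch: shared contract for Fly/Railway/Render/Docker providers.
interface DeployTarget {
  appName: string;
  region?: string;
  env: Record<string, string>;
}
interface DeployResult { url: string; version: string; }

interface DeploymentProvider {
  readonly id: "fly" | "railway" | "render" | "docker";
  validate(target: DeployTarget): string[]; // returns config errors, empty = ok
  deploy(image: string, target: DeployTarget): Promise<DeployResult>;
}

// Dry-run provider: validates config and fabricates a result without deploying.
class DryRunProvider implements DeploymentProvider {
  readonly id = "docker" as const;
  validate(t: DeployTarget): string[] {
    return /^[a-z0-9-]+$/.test(t.appName)
      ? []
      : ["appName must be lowercase letters, digits, or dashes"];
  }
  async deploy(image: string, t: DeployTarget): Promise<DeployResult> {
    return {
      url: `https://${t.appName}.example.dev`, // placeholder domain
      version: image.split(":")[1] ?? "latest",
    };
  }
}
```

The deploy wizard would run `validate()` before showing the confirm button, then stream logs while `deploy()` is in flight.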
### DEPLOY-CLOUD-003: Deploy UI & Workflow (0.5 weeks)
**Implement:**
- "Deploy to Cloud" button
- Platform selection wizard
- Configuration validation
- Deployment progress tracking
- Rollback functionality
**Integration Points:**
- Add to EditorTopbar
- Add to Backend Services Panel
- Link from Workflow Monitoring Dashboard
---
## SERIES 4: Monitoring & Observability (2 weeks)
### MONITOR-001: Metrics Collection (0.5 weeks)
**Implement:**
- Execution metrics aggregation
- Time-series data storage
- Real-time metric updates via WebSocket
**Database Schema:**
```sql
CREATE TABLE workflow_metrics (
  id TEXT PRIMARY KEY,
  workflow_id TEXT NOT NULL,
  date TEXT NOT NULL,           -- YYYY-MM-DD
  hour INTEGER NOT NULL,        -- 0-23
  execution_count INTEGER DEFAULT 0,
  success_count INTEGER DEFAULT 0,
  error_count INTEGER DEFAULT 0,
  total_duration_ms INTEGER DEFAULT 0,
  avg_duration_ms INTEGER DEFAULT 0,
  p95_duration_ms INTEGER DEFAULT 0,
  p99_duration_ms INTEGER DEFAULT 0,
  UNIQUE(workflow_id, date, hour)
);

CREATE INDEX idx_metrics_workflow ON workflow_metrics(workflow_id);
CREATE INDEX idx_metrics_date ON workflow_metrics(date);
```
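A sketch of how one finished execution could fold into its hourly bucket using SQLite's upsert against the `UNIQUE(workflow_id, date, hour)` constraint above. The statement and helper names are illustrative; the percentile columns would be recomputed separately by the MetricsAggregator:

```typescript
// Sketch: fold a finished execution into its hourly metrics bucket.
const UPSERT_METRIC = `
INSERT INTO workflow_metrics (id, workflow_id, date, hour,
  execution_count, success_count, error_count, total_duration_ms)
VALUES (?, ?, ?, ?, 1, ?, ?, ?)
ON CONFLICT(workflow_id, date, hour) DO UPDATE SET
  execution_count   = execution_count + 1,
  success_count     = success_count + excluded.success_count,
  error_count       = error_count + excluded.error_count,
  total_duration_ms = total_duration_ms + excluded.total_duration_ms`;

/** Derive the (date, hour) bucket for an execution start time, in UTC. */
function bucketFor(startedAtMs: number): { date: string; hour: number } {
  const d = new Date(startedAtMs);
  return { date: d.toISOString().slice(0, 10), hour: d.getUTCHours() };
}
```

Because the write is a single upsert per execution, metrics collection stays O(1) regardless of traffic volume.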
**Files to Create:**
```
packages/noodl-viewer-cloud/src/monitoring/
├── MetricsCollector.ts
├── MetricsAggregator.ts
└── MetricsStore.ts
```
### MONITOR-002: Monitoring Dashboard (1 week)
**Implement:**
- Workflow status overview
- Performance metrics charts
- Error log viewer
- Real-time execution feed
**Files to Create:**
```
packages/noodl-editor/src/editor/src/views/WorkflowMonitoring/
├── MonitoringDashboard.tsx
├── WorkflowStatusCard.tsx
├── PerformanceChart.tsx
├── ErrorLogViewer.tsx
└── RealtimeExecutionFeed.tsx
```
**Chart Libraries:**
- Use Recharts (already used in Nodegx)
- Line charts for execution trends
- Bar charts for error rates
- Heatmaps for hourly patterns
### MONITOR-003: Alerting System (0.5 weeks)
**Implement:**
- Alert configuration UI
- Email notifications
- Webhook notifications
- Alert history
**Files to Create:**
```
packages/noodl-viewer-cloud/src/monitoring/
├── AlertManager.ts
├── AlertEvaluator.ts
└── NotificationSender.ts
packages/noodl-editor/src/editor/src/views/WorkflowMonitoring/
└── AlertConfigPanel.tsx
```
**Alert Types:**
- Error rate threshold
- Execution failure
- Response time threshold
- Workflow didn't execute (schedule check)
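Threshold evaluation can stay a pure function over a metrics window, which keeps the AlertEvaluator trivially testable; the rule shapes here are illustrative:

```typescript
// Sketch: deciding whether an alert rule fires for a metrics window.
type AlertRule =
  | { kind: "error-rate"; thresholdPct: number }
  | { kind: "response-time"; thresholdMs: number };

type WindowStats = { executions: number; errors: number; p95Ms: number };

function shouldFire(rule: AlertRule, stats: WindowStats): boolean {
  switch (rule.kind) {
    case "error-rate":
      if (stats.executions === 0) return false; // no traffic, no alert
      return (stats.errors / stats.executions) * 100 > rule.thresholdPct;
    case "response-time":
      return stats.p95Ms > rule.thresholdMs;
  }
}
```

The AlertManager would run this on a timer per workflow and hand firing rules to the NotificationSender, with debouncing so a sustained error burst produces one email rather than hundreds.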
---
## BONUS: Python Runtime for AI Workflows (4 weeks)
This is the game-changer for AI agent development.
### PYTHON-001: Architecture & Runtime Bridge (1 week)
**Design Decision:**
Instead of running Python in Node.js, create a **parallel Python runtime** that communicates with the Node.js server via HTTP/gRPC:
```
┌─────────────────────────────────────────────────────┐
│ Node.js Backend Server (Port 8080) │
│ ├─ Express API │
│ ├─ WebSocket server │
│ ├─ JavaScript CloudRunner │
│ └─ Python Runtime Proxy │
└─────────────────┬───────────────────────────────────┘
│ HTTP/gRPC calls
┌─────────────────────────────────────────────────────┐
│ Python Runtime Server (Port 8081) │
│ ├─ FastAPI/Flask │
│ ├─ Python CloudRunner │
│ ├─ Workflow Executor │
│ └─ AI Integration Layer │
│ ├─ LangGraph support │
│ ├─ LangChain support │
│ ├─ Anthropic SDK │
│ └─ OpenAI SDK │
└─────────────────────────────────────────────────────┘
```
**Why This Approach:**
- Native Python execution (no PyNode.js hacks)
- Access to full Python ecosystem
- Better performance for AI workloads
- Easier debugging
- Independent scaling
**Files to Create:**
```
packages/noodl-python-runtime/
├── server.py # FastAPI server
├── runner.py # Python CloudRunner
├── executor.py # Workflow executor
├── nodes/ # Python node implementations
│ ├── triggers/
│ ├── ai/
│ ├── logic/
│ └── data/
└── requirements.txt
packages/noodl-viewer-cloud/src/python/
└── PythonRuntimeProxy.ts # Node.js → Python bridge
```
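On the Node.js side, the bridge could look like the sketch below; the `/execute` route and payload shape are assumptions, and the fetch function is injectable so the proxy can be tested without a running Python server:

```typescript
// Sketch: forwarding a node execution to the Python runtime over HTTP.
type ExecuteRequest = { node: string; inputs: Record<string, unknown> };

class PythonRuntimeProxy {
  constructor(
    private baseUrl = "http://127.0.0.1:8081",
    private fetchFn: typeof fetch = fetch // injectable for tests
  ) {}

  /** POST the node name and inputs; the Python runtime returns the outputs. */
  async execute(req: ExecuteRequest): Promise<Record<string, unknown>> {
    const res = await this.fetchFn(`${this.baseUrl}/execute`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(req),
    });
    if (!res.ok) throw new Error(`Python runtime error: ${res.status}`);
    return (await res.json()) as Record<string, unknown>;
  }
}
```

gRPC would replace the fetch call with a generated stub, but the proxy's surface, node name in, outputs out, stays the same.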
### PYTHON-002: Core Python Nodes (1 week)
**Implement:**
- Python Code node (custom logic)
- IF/ELSE/Switch (Python expressions)
- For Each (Python iteration)
- Transform (Python lambdas)
- HTTP Request (using `requests` or `httpx`)
**Node Definition Format:**
Keep the same JSON format but with Python execution:
```python
# packages/noodl-python-runtime/nodes/logic/if_condition.py
from typing import Dict, Any
from runtime.node import Node, NodeInput, NodeOutput, Signal


class IfConditionNode(Node):
    """Python IF condition node"""

    name = "python.logic.if"
    display_name = "IF Condition"
    category = "Logic"

    inputs = [
        NodeInput("condition", "boolean", display_name="Condition"),
        NodeInput("trigger", "signal", display_name="Evaluate"),
    ]
    outputs = [
        NodeOutput("true", "signal", display_name="True"),
        NodeOutput("false", "signal", display_name="False"),
    ]

    async def execute(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        condition = inputs.get("condition", False)
        if condition:
            return {"true": Signal()}
        else:
            return {"false": Signal()}
```
### PYTHON-003: AI/LLM Integration Nodes (1.5 weeks)
**Implement:**
- Claude API node (Anthropic SDK)
- OpenAI API node
- LangChain Agent node
- LangGraph Workflow node
- Vector Store Query node (Pinecone, Qdrant, etc.)
- Embedding Generation node
**Files to Create:**
```
packages/noodl-python-runtime/nodes/ai/
├── claude_completion.py
├── openai_completion.py
├── langchain_agent.py
├── langgraph_workflow.py
├── vector_store_query.py
├── generate_embeddings.py
└── prompt_template.py
```
**Example: Claude API Node**
```python
# packages/noodl-python-runtime/nodes/ai/claude_completion.py
from typing import Dict, Any
from runtime.node import Node, NodeInput, NodeOutput, Signal
import anthropic
import os


class ClaudeCompletionNode(Node):
    """Claude API completion node"""

    name = "python.ai.claude"
    display_name = "Claude Completion"
    category = "AI"

    inputs = [
        NodeInput("prompt", "string", display_name="Prompt"),
        NodeInput("system", "string", display_name="System Prompt", optional=True),
        NodeInput("model", "string", display_name="Model",
                  default="claude-sonnet-4-20250514"),
        NodeInput("max_tokens", "number", display_name="Max Tokens", default=1024),
        NodeInput("temperature", "number", display_name="Temperature", default=1.0),
        NodeInput("api_key", "string", display_name="API Key",
                  optional=True, secret=True),
        NodeInput("execute", "signal", display_name="Execute"),
    ]
    outputs = [
        NodeOutput("response", "string", display_name="Response"),
        NodeOutput("usage", "object", display_name="Usage Stats"),
        NodeOutput("done", "signal", display_name="Done"),
        NodeOutput("error", "string", display_name="Error"),
    ]

    async def execute(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        try:
            api_key = inputs.get("api_key") or os.getenv("ANTHROPIC_API_KEY")
            if not api_key:
                raise ValueError("ANTHROPIC_API_KEY not configured")
            # Async client so the call doesn't block the runtime's event loop
            client = anthropic.AsyncAnthropic(api_key=api_key)
            message = await client.messages.create(
                model=inputs.get("model"),
                max_tokens=inputs.get("max_tokens"),
                temperature=inputs.get("temperature"),
                system=inputs.get("system", ""),
                messages=[
                    {"role": "user", "content": inputs.get("prompt")}
                ],
            )
            return {
                "response": message.content[0].text,
                "usage": {
                    "input_tokens": message.usage.input_tokens,
                    "output_tokens": message.usage.output_tokens,
                },
                "done": Signal(),
            }
        except Exception as e:
            return {
                "error": str(e),
            }
```
**Example: LangGraph Agent Node**
```python
# packages/noodl-python-runtime/nodes/ai/langgraph_workflow.py
from typing import Dict, Any
from runtime.node import Node, NodeInput, NodeOutput, Signal
from langgraph.graph import StateGraph, END
from langchain_anthropic import ChatAnthropic


class LangGraphWorkflowNode(Node):
    """LangGraph multi-agent workflow"""

    name = "python.ai.langgraph"
    display_name = "LangGraph Workflow"
    category = "AI"

    inputs = [
        NodeInput("workflow_definition", "object",
                  display_name="Workflow Definition"),
        NodeInput("input_data", "object", display_name="Input Data"),
        NodeInput("execute", "signal", display_name="Execute"),
    ]
    outputs = [
        NodeOutput("result", "object", display_name="Result"),
        NodeOutput("state_history", "array", display_name="State History"),
        NodeOutput("done", "signal", display_name="Done"),
        NodeOutput("error", "string", display_name="Error"),
    ]

    async def execute(self, inputs: Dict[str, Any]) -> Dict[str, Any]:
        try:
            workflow_def = inputs.get("workflow_definition")
            input_data = inputs.get("input_data", {})

            # Build LangGraph workflow from definition
            graph = self._build_graph(workflow_def)

            # Execute
            result = await graph.ainvoke(input_data)

            return {
                "result": result,
                "state_history": result.get("_history", []),
                "done": Signal(),
            }
        except Exception as e:
            return {"error": str(e)}

    def _build_graph(self, definition: Dict):
        # Build a StateGraph from the Nodegx definition and return graph.compile()
        # This allows visual design of LangGraph workflows!
        pass
```
### PYTHON-004: Language Toggle & Node Registry (0.5 weeks)
**Implement:**
- Workflow language selector (JavaScript vs. Python)
- Node palette filtering based on language
- Validation to prevent mixing languages
- Migration helpers (JS → Python)
**Files to Create:**
```
packages/noodl-editor/src/editor/src/views/WorkflowLanguageSelector.tsx
packages/noodl-runtime/src/nodes/node-registry.ts
```
**UI Changes:**
Add language selector to Cloud Functions panel:
```
┌─────────────────────────────────────────────┐
│ Cloud Functions [+] │
├─────────────────────────────────────────────┤
│ Runtime Language: ○ JavaScript ● Python │
├─────────────────────────────────────────────┤
│ 📁 /#__cloud__/ │
│ ├─ ProcessOrder (JS) │
│ ├─ GenerateReport (JS) │
│ └─ ChatAssistant (Python) 🐍 │
└─────────────────────────────────────────────┘
```
**Node Palette Changes:**
```
When JavaScript selected:
├── HTTP Request (JS)
├── Code (JavaScript)
├── Transform (JS)
└── ...
When Python selected:
├── HTTP Request (Python)
├── Code (Python)
├── Transform (Python)
├── Claude Completion 🤖
├── OpenAI Completion 🤖
├── LangGraph Agent 🤖
├── Vector Store Query 🤖
└── ...
```
---
## Success Metrics
How we'll know this phase was successful:
### Functional Completeness
- [ ] Can create webhook endpoints that respond to HTTP requests
- [ ] Can schedule workflows with cron expressions
- [ ] Can view complete execution history with node-by-node data
- [ ] Can deploy workflows to production cloud (Fly.io, Railway, or Render)
- [ ] Can monitor workflow performance and errors in real-time
- [ ] Can create Python workflows for AI use cases
- [ ] Can use Claude/OpenAI APIs in visual workflows
### User Experience
- [ ] Creating a webhook workflow takes < 5 minutes
- [ ] Debugging failed workflows takes < 2 minutes (using execution history)
- [ ] Deploying to production takes < 3 minutes
- [ ] Setting up AI chat assistant takes < 10 minutes
- [ ] No documentation needed for basic workflows (intuitive)
### Technical Performance
- [ ] Workflow execution overhead < 50ms
- [ ] Execution history queries < 100ms
- [ ] Real-time monitoring updates < 1 second latency
- [ ] Python runtime performance within 20% of JavaScript
- [ ] Can handle 1000 concurrent workflow executions
### Competitive Position
- [ ] Feature parity with n8n core features (triggers, monitoring, deployment)
- [ ] Better UX than n8n (visual consistency, execution debugging)
- [ ] Unique advantages: AI-first Python runtime, integrated with Nodegx frontend
---
## Risk Assessment
### High Risks
1. **Python Runtime Complexity** ⚠️⚠️⚠️
- Two separate runtimes to maintain
- Language interop challenges
- Deployment complexity increases
- **Mitigation:** Start with JavaScript-only, add Python in Phase 2
2. **Deployment Platform Variability** ⚠️⚠️
- Each platform has different constraints
- Difficult to test all scenarios
- User environment issues
- **Mitigation:** Support two platforms initially (Fly.io, Railway); add Render once those are stable
3. **Execution History Storage Growth** ⚠️⚠️
- Could fill disk quickly with large workflows
- Privacy concerns with stored data
- Query performance degradation
- **Mitigation:** Implement retention policies, data compression, pagination
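The retention-policy mitigation can start as a periodic DELETE keyed on a cutoff timestamp. A sketch of the query builder, assuming a SQLite-style execution-history table (the `workflow_executions` table and `started_at` column are placeholders, not the final schema):

```javascript
// Hypothetical retention job: build a query deleting executions
// older than `days`. Parameterized to avoid SQL injection.
function buildRetentionQuery(days, now = Date.now()) {
  const cutoff = now - days * 24 * 60 * 60 * 1000; // ms since epoch
  return {
    sql: "DELETE FROM workflow_executions WHERE started_at < ?",
    params: [cutoff],
  };
}

const { sql, params } = buildRetentionQuery(30);
console.log(sql, params[0]); // cutoff = now minus 30 days, in ms
```

Run on a timer (or after each N executions), this keeps the history table bounded without touching the hot execution path; compression and pagination then address the remaining growth and query-performance concerns.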
### Medium Risks
4. **Monitoring Performance Impact** ⚠️
- Metrics collection could slow workflows
- WebSocket connections scale issues
- **Mitigation:** Async metrics, batching, optional detailed logging
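"Async metrics, batching" in the mitigation above usually means the hot path pays only for an in-memory push, with one I/O call per batch on a timer or size threshold. An illustrative sketch (the flush target is an assumption, e.g. a SQLite write or HTTP POST):

```javascript
// Batching metrics collector: record() is O(1); flushing happens
// out-of-band, so workflow execution is never blocked on metrics I/O.
class MetricsBuffer {
  constructor(flushFn, { maxSize = 100, intervalMs = 5000 } = {}) {
    this.flushFn = flushFn; // receives an array of data points
    this.maxSize = maxSize;
    this.buffer = [];
    this.timer = setInterval(() => this.flush(), intervalMs);
  }

  record(name, value) {
    this.buffer.push({ name, value, ts: Date.now() });
    if (this.buffer.length >= this.maxSize) this.flush();
  }

  flush() {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.flushFn(batch); // single call per batch, not per metric
  }

  stop() {
    clearInterval(this.timer);
    this.flush(); // drain anything still buffered
  }
}
```

With `maxSize` 100, a thousand metric points cost ten flushes instead of a thousand writes, which is the main lever for keeping monitoring overhead off the execution path.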
5. **Migration from Parse** ⚠️
- Users with existing Parse-based workflows
- No clear migration path
- **Mitigation:** Keep Parse adapter working, provide migration wizard
### Low Risks
6. **UI Complexity** ⚠️
- Many new panels and views
- Risk of overwhelming users
- **Mitigation:** Progressive disclosure, onboarding wizard
---
## Open Questions
1. **Database Choice for Production**
- SQLite is fine for single-server deployments
- What about multi-region, high-availability?
- Should we support PostgreSQL/MySQL for production?
2. **Python Runtime Packaging**
- How do we handle Python dependencies?
- Should users provide requirements.txt?
- Do we use virtual environments?
- What about native extensions (which require compilation)?
3. **AI Node Pricing**
- Claude/OpenAI nodes require API keys
- Do we provide pooled API access with credits?
- Or do users bring their own keys only?
4. **Workflow Versioning**
- Should we track workflow versions?
- Enable rollback to previous versions?
- How does this interact with Git?
5. **Multi-User Collaboration**
- What if multiple people deploy the same workflow?
- How to handle concurrent edits?
- Environment separation (dev/staging/prod per user)?
---
## Next Steps
### Immediate Actions
1. **Validate Vision** - Review this document with stakeholders
2. **Prioritize Features** - Which series should we start with?
3. **Prototype Key Risks** - Build proof-of-concept for Python runtime
4. **Design Review** - UI/UX review for new panels and workflows
5. **Resource Allocation** - Assign developers and timeline
### Phased Rollout Recommendation
**Phase 1 (MVP):** Series 1 + Series 2
- Core workflow runtime with triggers and logic nodes
- Execution history and debugging
- **Goal:** Internal dogfooding, validate architecture
- **Timeline:** 7 weeks
**Phase 2 (Beta):** Series 3
- Production deployment to Fly.io
- Basic monitoring
- **Goal:** Early access users, prove deployment works
- **Timeline:** 3 weeks
**Phase 3 (v1.0):** Series 4
- Complete monitoring and alerting
- Polish and bug fixes
- **Goal:** Public release, compare with n8n
- **Timeline:** 2 weeks
**Phase 4 (v2.0):** Bonus - Python Runtime
- Python workflow support
- AI/LLM nodes
- **Goal:** Differentiation, AI use case enablement
- **Timeline:** 4 weeks
---
## Appendix: Competitive Analysis
### n8n Feature Comparison
| Feature | n8n | Nodegx Current | Nodegx After This Phase |
|---------|-----|----------------|-------------------------|
| Visual workflow editor | ✅ | ✅ | ✅ |
| Webhook triggers | ✅ | ❌ | ✅ |
| Schedule triggers | ✅ | ❌ | ✅ |
| Execution history | ✅ | ❌ | ✅ |
| Error handling | ✅ | ⚠️ Basic | ✅ |
| Monitoring dashboard | ✅ | ❌ | ✅ |
| Self-hosting | ✅ | ⚠️ Local only | ✅ |
| Cloud deployment | ✅ | ❌ | ✅ |
| Custom code nodes | ✅ | ⚠️ Limited | ✅ |
| **Python runtime** | ❌ | ❌ | ✅ ⭐ |
| **AI/LLM nodes** | ⚠️ Basic | ❌ | ✅ ⭐ |
| **Integrated frontend** | ❌ | ✅ | ✅ ⭐ |
| **Visual debugging** | ⚠️ Limited | ❌ | ✅ ⭐ |
**Nodegx Advantages After This Phase:**
- ⭐ Native Python runtime for AI workflows
- ⭐ Integrated with visual frontend development
- ⭐ Better execution debugging (pin to canvas)
- ⭐ Single tool for full-stack development
- ⭐ AI-first node library
**n8n Advantages:**
- Mature ecosystem (400+ integrations)
- Established community
- Extensive documentation
- Battle-tested at scale
- Enterprise features (SSO, RBAC, etc.)
---
## Conclusion
This "Cloud Functions Revival" phase would transform Nodegx from a frontend-focused tool into a true full-stack development platform. The combination of visual workflow design, execution history, production deployment, and especially the Python runtime for AI puts Nodegx in a unique position:
**"The only visual development platform where you can design your frontend, build your backend logic, create AI agents, and deploy everything to production - all without leaving the canvas."**
The total investment is significant (12-16 weeks) but positions Nodegx to compete directly with n8n while offering unique differentiation through:
1. Integrated frontend development
2. Python runtime for AI use cases
3. Superior debugging experience
4. Modern, consistent UI
This could be the feature set that makes Nodegx indispensable for full-stack developers and AI engineers.