Edge-Veda

Added: Feb 25, 2026
Category: Model & Inference Framework
License: Open Source
Tags: Workflow Automation · Large Language Model · Multimodal · RAG · Agent Framework · SDK · Model & Inference Framework · Knowledge Management, Retrieval & RAG · Model Training & Inference · Computer Vision & Multimodal

On-device full-stack AI SDK for Flutter with LLM, Vision, Speech, Image Gen, and RAG; features compute budget contracts and adaptive QoS with zero cloud dependency.

Edge-Veda is an AI SDK for Flutter that acts as a managed on-device runtime: models load, run, and are governed entirely on the device, with zero cloud dependency.

Project Positioning

Edge-Veda addresses the privacy concerns, network latency, and high backend costs of mobile AI development. It brings a complete AI runtime to edge devices, with end-to-end capabilities ranging from multimodal inference to speech processing and vector retrieval.

Core Capabilities

Inference

  • Text Generation: Streaming or blocking token generation, multi-turn dialogue, 42–43 tok/s
  • Vision Inference: A VLM processes camera frames, with the model kept loaded between frames
  • Image Generation: stable-diffusion.cpp + Metal GPU, 512×512 in ~14s

Speech Processing

  • STT: whisper.cpp + Metal GPU acceleration, ~670 ms per 3 s audio chunk
  • TTS: iOS AVSpeechSynthesizer wrapper, zero additional binary size
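
The STT figure above can be read as a real-time factor (processing time divided by audio duration). The short Dart sketch below works through that arithmetic; `realTimeFactor` is a hypothetical helper written for this illustration and is not part of the SDK.

```dart
/// Real-time factor: processing time divided by audio duration.
/// Values below 1.0 mean transcription keeps up with live audio.
double realTimeFactor(double processingMs, double audioMs) {
  if (audioMs <= 0) {
    throw ArgumentError('audioMs must be positive');
  }
  return processingMs / audioMs;
}

void main() {
  // ~670 ms of processing for each 3 s (3000 ms) chunk, as quoted above.
  final rtf = realTimeFactor(670, 3000);
  print(rtf.toStringAsFixed(3)); // 0.223

  // Time left over in each chunk window for other work, in milliseconds.
  print(3000 - 670); // 2330
}
```

At a factor of roughly 0.22, each 3 s chunk would be transcribed well before the next one finishes recording, assuming the quoted timing holds on the target device.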

Advanced Features

  • Function Calling: ToolDefinition + ToolRegistry, multi-turn tool chains, JSON recovery
  • RAG Pipeline: Built-in pure Dart HNSW VectorIndex + RagPipeline
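
To make the retrieval step of a RAG pipeline concrete, here is a minimal pure-Dart sketch of similarity search over embeddings. It uses an exhaustive cosine-similarity scan rather than HNSW, and it does not use the SDK's VectorIndex or RagPipeline classes; `Doc`, `cosineSimilarity`, and `topK` are hypothetical names, and the two-dimensional vectors are toy data.

```dart
import 'dart:math' as math;

/// A stored chunk of text with its embedding vector (hypothetical type).
class Doc {
  final String text;
  final List<double> embedding;
  const Doc(this.text, this.embedding);
}

/// Cosine similarity between two equal-length vectors.
double cosineSimilarity(List<double> a, List<double> b) {
  if (a.length != b.length) {
    throw ArgumentError('Vectors must have the same length');
  }
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA == 0 || normB == 0) return 0.0;
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

/// Returns the k documents most similar to the query, best first.
/// This is an exhaustive O(n) scan; an HNSW index approximates the
/// same ranking while visiting far fewer vectors.
List<Doc> topK(List<double> query, List<Doc> docs, int k) {
  final scored = docs
      .map((d) => MapEntry(d, cosineSimilarity(query, d.embedding)))
      .toList()
    ..sort((x, y) => y.value.compareTo(x.value));
  return scored.take(k).map((e) => e.key).toList();
}

void main() {
  const docs = [
    Doc('battery tips', [1.0, 0.0]),
    Doc('thermal limits', [0.0, 1.0]),
    Doc('power saving', [0.9, 0.1]),
  ];
  final hits = topK([1.0, 0.0], docs, 2);
  print(hits.map((d) => d.text).toList()); // [battery tips, power saving]
}
```

In a RAG flow, the retrieved texts would then be placed into the prompt ahead of the user's question before generation.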

Runtime Governance

  • Compute Budget Contract: Declare limits on p95 latency, battery drain, thermal state, and memory
  • QoS Levels: Full / Reduced / Minimal / Paused adaptive degradation
  • Model Advisor: DeviceProfile hardware detection, ModelAdvisor 4D scoring
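
One way to picture how a budget contract and the four QoS levels could interact is sketched below. Everything here is an assumption for illustration: `Budget`, `Telemetry`, `countViolations`, and `chooseQos` are hypothetical names (not the SDK's EdgeVedaBudget API), the thresholds are made up, and the "one level down per violated limit" policy is a deliberately simple stand-in for whatever policy the SDK actually applies.

```dart
/// The four levels named above, ordered from most to least capable.
enum QosLevel { full, reduced, minimal, paused }

/// Declared limits (hypothetical shape).
class Budget {
  final double maxP95LatencyMs;
  final double maxBatteryDrainPctPerHour;
  final int maxThermalState; // assumed scale: 0 = nominal, 3 = critical
  final int maxMemoryMb;
  const Budget({
    required this.maxP95LatencyMs,
    required this.maxBatteryDrainPctPerHour,
    required this.maxThermalState,
    required this.maxMemoryMb,
  });
}

/// Measured values for the same four dimensions (hypothetical shape).
class Telemetry {
  final double p95LatencyMs;
  final double batteryDrainPctPerHour;
  final int thermalState;
  final int memoryMb;
  const Telemetry({
    required this.p95LatencyMs,
    required this.batteryDrainPctPerHour,
    required this.thermalState,
    required this.memoryMb,
  });
}

/// Counts how many declared limits the current measurements exceed.
int countViolations(Budget b, Telemetry t) => [
      t.p95LatencyMs > b.maxP95LatencyMs,
      t.batteryDrainPctPerHour > b.maxBatteryDrainPctPerHour,
      t.thermalState > b.maxThermalState,
      t.memoryMb > b.maxMemoryMb,
    ].where((violated) => violated).length;

/// Illustrative policy: step down one level per violated limit,
/// pausing once three or more limits are exceeded.
QosLevel chooseQos(Budget b, Telemetry t) {
  final violations = countViolations(b, t);
  if (violations == 0) return QosLevel.full;
  if (violations == 1) return QosLevel.reduced;
  if (violations == 2) return QosLevel.minimal;
  return QosLevel.paused;
}

void main() {
  const budget = Budget(
    maxP95LatencyMs: 2000,
    maxBatteryDrainPctPerHour: 3,
    maxThermalState: 1,
    maxMemoryMb: 1200,
  );
  // Only the thermal limit is exceeded here.
  const warmDevice = Telemetry(
    p95LatencyMs: 1500,
    batteryDrainPctPerHour: 2,
    thermalState: 2,
    memoryMb: 900,
  );
  print(chooseQos(budget, warmDevice)); // QosLevel.reduced
}
```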

Installation

# pubspec.yaml
dependencies:
  edge_veda: ^2.4.1

Requires iOS 13.0 or later. The XCFramework (~31 MB) is downloaded automatically during pod install.

Quick Start

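import 'dart:io'; // provides stdout
import 'package:edge_veda/edge_veda.dart'; // import path assumed from the package name

// modelPath: location of a model file already present on the device.
Future<void> quickStart(String modelPath) async {
  final edgeVeda = EdgeVeda();
  await edgeVeda.init(EdgeVedaConfig(modelPath: modelPath));

  // Streaming generation: write each token as soon as it arrives
  await for (final chunk in edgeVeda.generateStream('Explain quantum computing')) {
    stdout.write(chunk.token);
  }
}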

Architecture

Flutter App (Dart)
  └── ChatSession / RagPipeline / VectorIndex
  └── EdgeVeda (generate, embed, describeImage)
  └── Workers (StreamingWorker, VisionWorker, WhisperWorker)
  └── Scheduler + EdgeVedaBudget + TelemetryService
  └── FFI Bindings (43 C functions)
       └── XCFramework (llama.cpp, whisper.cpp, stable-diffusion.cpp)

Key design: all inference runs in background isolates, native pointers never cross isolate boundaries, and models persist in memory after loading.
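
The isolate pattern behind that design can be illustrated with Dart's standard `Isolate.run` (available since Dart 2.19). This is a simplified one-shot sketch: `countTokens` is a hypothetical stand-in for expensive work and no model is involved. A persistent worker, such as the ones listed in the diagram, would more likely keep a long-lived isolate and exchange messages with it, but the boundary rule is the same: only plain Dart values are passed across.

```dart
import 'dart:isolate';

/// Stand-in for an expensive inference call (hypothetical; no model involved).
/// Counts whitespace-separated words in the prompt.
int countTokens(String prompt) =>
    prompt.split(RegExp(r'\s+')).where((w) => w.isNotEmpty).length;

Future<void> main() async {
  const prompt = 'Explain quantum computing';

  // Isolate.run spawns a background isolate, runs the closure there, and
  // completes with its result. Only plain Dart values cross the boundary
  // (a String going in, an int coming back), never native pointers.
  final tokens = await Isolate.run(() => countTokens(prompt));

  print(tokens); // 3
}
```

Keeping the heavy call off the main isolate is what lets a Flutter UI stay responsive while work is in progress.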

Platform Support

  • iOS: Metal GPU full support, minimum iOS 13.0
  • macOS: Complete support
  • Android: Skeleton implemented, Vulkan GPU support planned

Code Scale

~22,700 LOC / 40 C API functions / 32 Dart SDK files

Primary languages: Dart (67.2%), C++ (8.7%), Shell (5.4%), Python (5.3%)
