llmfit
A cross-platform CLI tool written in Rust that right-sizes LLMs to your system's RAM, CPU, and GPU by detecting hardware specs and recommending suitable models and quantization strategies. Covers 206 models from 57 providers.
A curated list of free LLM inference APIs, covering rate limits, model lists, and special requirements for major platforms like OpenRouter, Google AI Studio, Groq, and Cerebras. Ideal for developers in the prototyping phase.
A minimal, hackable experimental harness for training LLMs on a single GPU node, covering all stages from pretraining to a ChatGPT-like UI.
An open-source framework by Stream for building vision AI agents that work with any model or video provider, leveraging Stream's edge network for ultra-low latency video experiences.
AirLLM optimizes inference memory usage so that 70B large language models can run on a single 4GB GPU card without quantization, distillation, or pruning. It now also supports running Llama 3.1 405B on 8GB of VRAM.
A modern AI gateway that provides a unified API compatible with OpenAI, Anthropic, Gemini, and AI SDK, enabling integration across multiple AI providers with automatic request translation and comprehensive tracing.
A curated collection of papers on building and evaluating language model agents via executable language grounding, covering LLM code generation, agents with tool use, web grounding, and robotics research.
Blades is a multimodal AI Agent framework for the Go language, supporting custom models, tools, memory, middleware, and more. It's designed for multi-turn conversations, chain-of-thought reasoning, and structured output applications.
Microsoft's family of open-source frontier voice AI models including both Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) models, designed for long-form audio processing with multilingual support.
Trinity-RFT is a general-purpose, flexible and user-friendly framework for LLM reinforcement fine-tuning (RFT). It decouples RFT into three coordinated components: Explorer, Trainer, and Buffer, enabling users with different backgrounds to train LLM-powered agents for specific domains.