
Rankify

Added: Feb 24, 2026
Category: Agent & Tooling
License: Open Source
Tags: Python, Workflow Automation, PyTorch, Transformers, RAG, SDK, Natural Language Processing, Agent & Tooling, Model & Inference Framework, Developer Tools & Coding, Knowledge Management, Retrieval & RAG, Education & Research Resources

A modular Python toolkit developed by the University of Innsbruck that integrates information retrieval, re-ranking, and retrieval-augmented generation (RAG), featuring 40+ pre-processed datasets and single-line pipeline construction.

Overview#

Rankify is an open-source Python toolkit developed by the Data Science group at the University of Innsbruck (DataScienceUIBK) to address the fragmentation of information retrieval (IR) and RAG toolchains. It unifies document retrieval, re-ranking, and Retrieval-Augmented Generation (RAG) in a single framework. The current version, v0.1.4, is released under the Apache-2.0 license.

Core Capabilities#

Retrieval#

  • Sparse Retrieval: BM25
  • Dense Retrieval: DPR, ANCE, ColBERT, BGE, Contriever, BPR, HYDE
  • SOTA Retrievers: SFR, E5, GritLM, M2, Nomic, Instructor, RaDeR, ReasonIR, BGE-Reasoner, ReasonEmbed, DiverRetriever
  • Pre-built Indices: Wikipedia and MS MARCO corpora
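To make the sparse/dense distinction concrete, here is a minimal, self-contained BM25 scoring sketch in pure Python. This illustrates the classic sparse-retrieval formula only; it is not Rankify's implementation, which wraps production retrievers behind a unified interface.

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    """Score one document against a query with the BM25 formula."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N  # average document length
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        if df == 0:
            continue
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        freq = tf[term]
        norm = freq + k1 * (1 - b + b * len(doc_terms) / avgdl)
        score += idf * freq * (k1 + 1) / norm
    return score

corpus = [
    "machine learning is a subfield of artificial intelligence".split(),
    "bm25 is a sparse retrieval scoring function".split(),
    "dense retrieval encodes queries and documents as vectors".split(),
]
query = "sparse retrieval".split()
ranked = sorted(range(len(corpus)),
                key=lambda i: bm25_score(query, corpus[i], corpus),
                reverse=True)
# The document mentioning both query terms ranks first.
```

Dense retrievers such as DPR or BGE replace this term-matching score with the similarity of learned query and document embeddings.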

Re-ranking#

Integrates 24+ state-of-the-art re-ranking models:

  • Cross-Encoders
  • RankGPT / RankGPT-API
  • MonoT5, MonoBERT, RankT5
  • LiT5Score, LiT5Distill
  • Vicuna Reranker, Zephyr Reranker
  • FlashRank, InRanker
  • Transformer Reranker (bge-reranker, mxbai-rerank, gte-multilingual, etc.)
  • API Services (Voyage, Jina, Mixedbread.ai)
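All of these rerankers share the same retrieve-then-rerank pattern: score each (query, document) pair jointly, then reorder the candidate list. The sketch below shows that interface with a simple term-overlap stand-in for the scoring model; in Rankify the score would come from a cross-encoder or LLM-based reranker, and the function names here are illustrative, not Rankify's API.

```python
def rerank(query, documents, score_fn, top_k=3):
    """Re-rank a candidate list by a (query, document) relevance score."""
    scored = [(score_fn(query, doc), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def overlap_score(query, doc):
    """Toy stand-in for a cross-encoder: fraction of query terms in the doc."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

docs = [
    "Dense retrieval uses vector embeddings",
    "Re-ranking refines an initial candidate list",
    "Cooking recipes for pasta",
]
top = rerank("re-ranking candidate list", docs, overlap_score, top_k=2)
# The re-ranking document rises to the top of the shortened list.
```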

RAG Generation#

  • Generation Strategies: Zero-shot, Basic-RAG, Chain-of-Thought-RAG, FiD (Fusion-in-Decoder), In-Context Learning RALM
  • LLM Backends: Hugging Face, vLLM, LiteLLM, OpenAI
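The Basic-RAG strategy boils down to stuffing retrieved passages into the prompt ahead of the question. A hedged sketch of that prompt assembly follows; the template and function name are illustrative assumptions, not Rankify's exact prompt format.

```python
def build_rag_prompt(question, contexts, max_contexts=3):
    """Assemble a Basic-RAG style prompt: numbered retrieved passages,
    then the question. Illustrative template only."""
    passages = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts[:max_contexts]))
    return (
        "Answer the question using only the passages below.\n\n"
        f"{passages}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "What is machine learning?",
    ["Machine learning is a subfield of AI.", "It learns patterns from data."],
)
```

Chain-of-Thought-RAG variants extend the same template with reasoning instructions, while FiD instead encodes each passage separately inside the model.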

Datasets & Evaluation#

  • 40+ Pre-retrieved Benchmark Datasets: NQ, TriviaQA, HotpotQA, FEVER, ELI5, PopQA, Musique, StrategyQA, BoolQ, WebQ, etc.
  • Each question in a dataset is paired with 1,000 pre-retrieved documents
  • Evaluation Metrics: Recall@k, Precision@k, MRR, nDCG, MAP
  • RAG Evaluation: Integrated RAGAS framework
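The listed retrieval metrics have standard definitions; a minimal pure-Python implementation of Recall@k, MRR, and binary-relevance nDCG@k makes them concrete (this is the textbook math, not Rankify's `rankify.metrics` code).

```python
import math

def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k results."""
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

def mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant document (0 if none found)."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(retrieved, relevant, k):
    """Binary-relevance nDCG: discounted gain of the ranking over the ideal."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc_id in enumerate(retrieved[:k], start=1)
              if doc_id in relevant)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal > 0 else 0.0

retrieved = ["d3", "d1", "d7", "d2"]  # ranked system output
relevant = {"d1", "d2"}               # gold relevant documents
```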

Architecture#

Modular Design#

  • rankify.retrievers: Multiple retriever implementations
  • rankify.models.reranking: Unified re-ranking interface
  • rankify.generator: RAG generator
  • rankify.dataset: Dataset management and loading
  • rankify.metrics: Evaluation metrics
  • rankify.agent: AI-assisted model selection (RankifyAgent)
  • rankify.server: REST API server
  • rankify.integrations: Framework integrations

Pipeline API#

Single-line pipeline creation:

from rankify import pipeline

# Complete RAG pipeline
rag = pipeline("rag", retriever="bge", reranker="flashrank", generator="basic-rag")
answers = rag("What is machine learning?", documents)

# Other pipeline types
pipeline("search")  # Document retrieval only
pipeline("rerank")  # Retrieval + re-ranking

Deployment#

REST API Server#

rankify serve --port 8000 --retriever bge --reranker flashrank

Python API Deployment#

from rankify.server import RankifyServer
server = RankifyServer(retriever="bge", reranker="flashrank")
server.start(port=8000)

Custom Index Building#

rankify-index index data/wikipedia_10k.jsonl --retriever bm25 --output ./indices
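The indexing command above consumes a JSONL corpus, one JSON document per line. The snippet below prepares such a file; the `id`/`contents` field names follow a common Pyserini-style convention and are an assumption here — check Rankify's indexing documentation for the exact schema it expects.

```python
import json
import os
import tempfile

# Hypothetical sample corpus; field names are an assumed convention.
docs = [
    {"id": "doc1", "contents": "Rankify unifies retrieval, re-ranking, and RAG."},
    {"id": "doc2", "contents": "BM25 is a classic sparse retrieval baseline."},
]

path = os.path.join(tempfile.gettempdir(), "corpus_sample.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for doc in docs:
        f.write(json.dumps(doc) + "\n")  # one JSON object per line

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```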

Framework Integration#

  • LangChain
  • LlamaIndex
  • Gradio Interactive Interface (Web Playground)

Installation#

# Environment setup
conda create -n rankify python=3.10
conda activate rankify

# PyTorch installation
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124

# Full installation
pip install "rankify[all]"

# Optional modules
pip install "rankify[retriever]"  # Retrieval functionality
pip install "rankify[reranking]"   # Re-ranking functionality
pip install "rankify[rag]"         # RAG endpoints

Use Cases#

  • Academic Research: Comparative studies of information retrieval and re-ranking methods
  • RAG System Benchmarking
  • QA System Prototyping
  • Enterprise Document Retrieval and Ranking
  • Knowledge Base QA System Construction
  • Multi-model Performance Comparison

Important Notes#

  • The full pre-retrieved dataset collection is approximately 1.48 TB, so downloading it requires significant storage and bandwidth; individual datasets can be fetched selectively
  • Some retrievers (e.g., ColBERT) require specific compilation environment dependencies
  • Recommended: PyTorch 2.5.1 and Python 3.10+

Authors#

Abdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt (University of Innsbruck)
