Details
- Microsoft AI launched Harrier-OSS-v1, a family of three open-source multilingual text embedding models: 270M, 0.6B, and 27B parameters, achieving state-of-the-art performance on the Multilingual MTEB-v2 benchmark.
- Harrier uses decoder-only architectures with last-token pooling and L2 normalization, and supports a 32k-token context window for long documents, far beyond the 512–1k token limits of traditional encoders.
- Models are instruction-tuned: queries carry a one-sentence task instruction prefix (e.g., for retrieval or similarity), while documents are encoded without any prefix, so query embeddings adapt to the task while a single document index serves many tasks.
- Supports over 100 languages for tasks like classification, clustering, retrieval, bitext mining, and reranking, powering better web grounding in AI chatbots.
- Announced by Mustafa Suleyman, who highlighted the Bing team's work under Jordi Ribas in upgrading retrieval for agentic AI with improved accuracy and multilingual coverage.
- Available on Hugging Face as microsoft/harrier-oss-v1 models, marking a shift from BERT-style encoders.
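The last-token pooling and L2 normalization described above can be sketched in a few lines. This is a minimal NumPy illustration with mock hidden states, not Harrier's actual code; the exact pooling details (picking the final non-padding token, then normalizing so dot products equal cosine similarity) are assumptions based on the description:

```python
import numpy as np

def last_token_pool(hidden_states, attention_mask):
    # hidden_states: (batch, seq_len, dim) from a decoder-only model
    # attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    # Select the hidden state of the last non-padding token per sequence.
    last_idx = attention_mask.sum(axis=1) - 1
    pooled = hidden_states[np.arange(hidden_states.shape[0]), last_idx]
    # L2-normalize so that a plain dot product is cosine similarity.
    return pooled / np.linalg.norm(pooled, axis=1, keepdims=True)

# Toy demo with random "hidden states" standing in for model output.
rng = np.random.default_rng(0)
h = rng.normal(size=(2, 5, 8))                # 2 sequences, 5 tokens, dim 8
mask = np.array([[1, 1, 1, 0, 0],
                 [1, 1, 1, 1, 1]])            # first sequence has 2 pad tokens
emb = last_token_pool(h, mask)                # (2, 8) unit-norm embeddings
sim = emb @ emb.T                             # cosine similarity matrix
```

In a real pipeline the embeddings would come from the model's final hidden layer, and `sim` would rank documents against a query.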
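The asymmetric encoding convention (instruction-prefixed queries, plain documents) might look like the sketch below. The `Instruct:`/`Query:` template is an assumption borrowed from other instruction-tuned embedding models, not a confirmed Harrier format, and the instruction wording is hypothetical:

```python
# Hypothetical helpers illustrating asymmetric query/document encoding.
RETRIEVAL_INSTRUCTION = "Given a web search query, retrieve relevant passages."

def format_query(query: str, instruction: str = RETRIEVAL_INSTRUCTION) -> str:
    # Queries get a one-sentence task prefix; the template is an assumption.
    return f"Instruct: {instruction}\nQuery: {query}"

def format_document(text: str) -> str:
    # Documents are embedded as-is, so one pre-built document index
    # can serve retrieval, clustering, reranking, and similarity tasks.
    return text
```

Because only the query side changes, switching tasks means changing one instruction string rather than re-embedding the corpus.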
Impact
Harrier positions Microsoft ahead in open-source multilingual embeddings, topping MTEB-v2 and challenging closed models from OpenAI and Cohere with scalable sizes up to 27B and 32k context for RAG systems. This lowers barriers for global AI apps, enhancing retrieval in 100+ languages and pressuring rivals to match instruction-tuned performance. By open-sourcing decoder-only embeddings, it accelerates agentic AI adoption, narrowing gaps in cross-lingual search where Google and Anthropic lag in comparable open releases.
