Models.

Open-source ML models for multilingual semantic search, fine-tuned on the code-mixed text people actually type.

NLP • Semantic Search

Marathlish MiniLM

A bilingual semantic search model fine-tuned on mixed Marathi-English (Marathlish) text. Built for search applications where users naturally mix Marathi and English — the way people actually speak and type in Maharashtra.

→ Fine-tuned on a curated Marathi-English code-mixed dataset
→ Based on the MiniLM-L6-v2 architecture for fast inference
→ Designed for semantic search, not just keyword matching
→ Works with the Sentence Transformers API out of the box
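
To see why keyword matching alone falls short on code-mixed text, here is a minimal sketch of a token-overlap (Jaccard) score. The `keyword_overlap` helper and the example strings are illustrative assumptions, not part of the model or its training data. A Marathlish query and an English document can mean the same thing yet share no tokens at all, which is the gap an embedding model is meant to close.

```python
def keyword_overlap(query: str, document: str) -> float:
    """Jaccard overlap between the lowercase token sets of two strings."""
    q_tokens = set(query.lower().split())
    d_tokens = set(document.lower().split())
    if not q_tokens or not d_tokens:
        return 0.0
    return len(q_tokens & d_tokens) / len(q_tokens | d_tokens)


# Hypothetical example strings, chosen only for illustration.
query = "majha phone kuthe aahe"      # Romanized Marathi mixed with English
document_a = "where is my mobile"     # similar intent, no shared tokens
document_b = "phone cover price"      # shares a token, different intent

print(keyword_overlap(query, document_a))  # zero overlap despite similar intent
print(keyword_overlap(query, document_b))  # nonzero overlap despite different intent
```

A pure keyword ranker would place `document_b` above `document_a` here, even though `document_a` is the closer match in meaning.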

marathi • english • semantic-search • sentence-transformers • multilingual

NLP • Semantic Search

Hinglish MiniLM

Semantic search model for Hindi-English (Hinglish) code-mixed queries. Trained to understand the natural mixing of Hindi and English that 500M+ speakers use daily across India. Optimized for real-world search applications.

→ Hindi-English bilingual semantic embeddings
→ Handles Hindi in both Devanagari and Latin (Romanized) scripts
→ Lightweight MiniLM architecture that runs on CPU
→ Cross-lingual retrieval support

hindi • english • hinglish • semantic-search • sentence-transformers

Quick Start

from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer("anuragwagh0/marathlish-minilm")

# Encode queries
queries = ["माझा phone कुठे आहे?", "best restaurants in Pune"]
embeddings = model.encode(queries)

# Use for semantic search, clustering, etc.
print(embeddings.shape)  # (2, 384)
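
Once you have embeddings, a semantic search step typically ranks documents by cosine similarity to the query embedding. The sketch below is one way to do that in plain Python. `cosine_similarity` and `semantic_search` are hypothetical helper names chosen for illustration, and the code assumes only that `model` has an `encode` method that takes a list of strings and returns one vector per string, as the `SentenceTransformer` object in the Quick Start does.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def semantic_search(model, query, documents, top_k=3):
    """Return up to top_k (document, score) pairs, best match first.

    Assumes model.encode(list_of_str) returns one vector per input string.
    """
    documents = list(documents)
    vectors = model.encode([query] + documents)
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [
        (doc, cosine_similarity(query_vec, vec))
        for doc, vec in zip(documents, doc_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

With the model loaded as in the Quick Start, a call would look like `semantic_search(model, "माझा phone कुठे आहे?", documents)`, where `documents` is your own list of strings. For larger collections, the `semantic_search` helper in `sentence_transformers.util` is likely a better fit than this pure-Python loop.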