Mid/Senior Machine Learning Engineer
About the Role
We are looking for a motivated and enthusiastic Mid/Senior Machine Learning Engineer to join our growing AI and recommendation team. In this role, you'll work on building intelligent recommendation systems using state-of-the-art machine learning and deep learning technologies.
You’ll work with PyTorch, TensorFlow, and Hugging Face models, contribute to deploying ML models using FastAPI, and integrate them with backend systems and vector databases like Milvus. Experience with LLMs, llama.cpp, and LangChain is a strong plus.
Key Responsibilities
- Build and improve recommendation models using frameworks like PyTorch and TensorFlow.
- Fine-tune and deploy Hugging Face models for NLP and vision tasks relevant to recommendation.
- Integrate ML models into scalable APIs using FastAPI and backend services.
- Work with LLMs (such as LLaMA and GPT) and tools like llama.cpp to enhance model capabilities.
- Manage and search high-dimensional vectors using Milvus or other vector stores.
- Implement retrieval-augmented generation (RAG) pipelines using tools like LangChain.
- Evaluate and tune model performance using offline and online metrics.
- Collaborate closely with backend engineers, data engineers, and product teams to deliver end-to-end recommendation features.
- Optimize ML workflows for scalability and efficiency in production environments.
- Stay up to date with the latest research in machine learning, LLMs, and recommendation systems.
Requirements
- Solid programming skills in Python, especially for ML and backend development.
- Hands-on experience with deep learning frameworks like PyTorch or TensorFlow.
- Good understanding of Hugging Face Transformers and how to fine-tune or use pretrained models.
- Experience building RESTful APIs using FastAPI or similar frameworks.
- Familiarity with LLMs and tools like llama.cpp.
- Basic knowledge of similarity search and vector databases or libraries (e.g., Milvus, FAISS).
- Understanding of recommendation system architectures (collaborative filtering, content-based, hybrid).
- Knowledge of Git and experience working in collaborative code environments.
- Comfort working in a fast-paced, research-to-production ML workflow.
- Strong communication and problem-solving skills.
Nice to Have (Not Required)
- Experience with LangChain and retrieval-augmented generation (RAG) pipelines.
- Experience working with Milvus in production.
- Familiarity with Kubernetes or Docker for deploying ML workloads.
- Interest or background in search and ranking, semantic embeddings, or LLM-based recommenders.
- Contributions to open-source ML projects or personal ML/NLP projects.
- Awareness of performance metrics used in recommender systems (e.g., NDCG, MAP, Recall@K).
- Experience working with RabbitMQ or similar message brokers.
- Understanding of event-driven architectures and building systems based on asynchronous messaging.