Head of AI/ML
Head of Research (AI Evaluation)
Vals AI · AI Infrastructure / AI Evaluation & Benchmarking · 📍 San Francisco Bay Area
Vals AI needs a founding research leader to define the science of LLM evaluation — building the benchmarks and methodologies that determine which AI models get trusted and deployed at scale.
Key responsibilities
- Advance the science of LLM evaluation: develop new paradigms for long-horizon, real-world tasks that go beyond judge models, static benchmarks, and human-in-the-loop (HITL) review
- Oversee Vals' full research portfolio, setting direction across active and future projects
- Publish high-impact research intended to shape field-wide methodology
- Recruit, build, and lead the research team from near-zero
- Partner directly with enterprise customers and frontier lab partners on applied evaluation problems
Requirements
- PhD in ML/NLP (completed or in progress) or equivalent frontier industry research track record
- Deep expertise in the LLM evaluation landscape: benchmarks, failure modes, judge-model approaches, HITL methodologies
- Research orientation toward real-world deployability over easily gamed benchmarks
- Strong written and verbal communication for publishing, presenting, and customer/lab dialogue
- Ability to work full-time onsite in San Francisco
Signals
- ⚠ No salary range disclosed despite 'highly competitive' claim
- ⚠ Company description cut off mid-sentence ('About Us: Fo...') — incomplete posting
- ⚠ No previous postings, making traction and culture hard to verify independently
- ⚠ Strict onsite-only requirement significantly limits candidate pool
- ✓ Clear, intellectually coherent product thesis — evaluation as infrastructure is a credible high-value wedge
- ✓ Explicit acknowledgment of existing research portfolio and enterprise + lab partnerships suggests real traction
- ✓ Strong equity offer implied for early research leader at founding-stage company
- ✓ Research ambition is field-level, not just product-level — rare mandate for a commercial role
- ✓ Relocation support offered, signaling willingness to invest in the right candidate
- ✓ Full benefits stack including meals, health, 401(k)