Core ML & On-Device ML Consulting

Vision, Core ML, Natural Language, on-device voice. Models that run locally without the latency, privacy cost, or cloud bill of a server round-trip. I've built three of my own apps whose entire value is on-device inference.

  • on-device Vision and Core ML pipelines with sub-100ms latency budgets
  • PyTorch/TensorFlow → Core ML conversion, quantization, model size reduction
  • on-device voice AI with ElevenLabs fallback for real-time conversations

Recognition
App Store Best New Apps 2026 · Product Hunt Product of the Day 2025 · CES Best of Innovation 2021 · CES Innovation Award 2021 · Webby Honoree 2021 · Google Material Design 2020

Credentials
Member of British Computer Society 2024 · BEng (Hons) 2017 · Apple WWDC Scholarship 2015

Tell me what you're working on, or grab a free 30-min scoping call. I reply within 48 hours.



"Vadim was instrumental to the success Epsy enjoyed on iOS, taking it from an idea on a Miro board to the highest rated and most downloaded app of its kind on the store."

James C. · Mobile Engineering Lead, Epsy

"We had a strict deadline, and Vadim managed to complete the job in time. He gave us meaningful feedback and suggested better approaches, not trying to blindly stick to our specification."

Founder · Pre-seed streaming service

"I can say with confidence that it will be difficult to find a better developer. Vadim is achievement-oriented, highly organized, with very good communication skills."

Alex Z. · Co-Founder, eda.so



Vision framework: object detection, segmentation, landmark detection, text recognition, barcode detection, plus the rotation and orientation quirks that come with them.
Core ML model integration: from a .mlmodel file to a feature that ships, including the 'how do we know it's still working after iOS 19 shipped' part.
Model conversion: PyTorch and TensorFlow to Core ML via coremltools, including the layers that don't convert cleanly.
Quantization & size reduction: post-training quantization, palettization, delta compression to keep app size sane.
Natural Language framework: tokenisation, language identification, tagging, custom word embeddings.
On-device LLMs: what's feasible today, the MLTensor path vs llama.cpp-style alternatives, when Foundation Models covers it and when you need your own.
Fallback strategy: when the on-device path fails (old device, corrupted cache, new iOS regression), what ships instead.

Advisory
£110 per hour

Architecture reviews, hiring help, second opinions on that thing that's been bugging you.

Available now

Retainer
£4,000 per month

Priority support: review agency code, join architecture calls, catch problems before they ship.

Apr '26 · May '26 · Jun '26

How do I decide between on-device and server-side?

On-device wins when any of these apply: the feature must work offline, latency needs to be under ~100ms, per-inference cost matters, or you can't send the data off-device. If none apply, server-side is usually cheaper and more maintainable.
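
The rule above can be written down as a small checklist. This is only a sketch: the `FeatureRequirements` fields and the `prefer_on_device` helper are hypothetical names chosen for illustration, and the 100 ms default simply mirrors the rough threshold mentioned above.

```python
from dataclasses import dataclass


@dataclass
class FeatureRequirements:
    """Hypothetical description of one ML feature's constraints."""
    must_work_offline: bool
    latency_budget_ms: float
    per_inference_cost_matters: bool
    data_must_stay_on_device: bool


def prefer_on_device(req: FeatureRequirements,
                     latency_threshold_ms: float = 100.0) -> bool:
    """True if any one of the four on-device criteria applies."""
    return (
        req.must_work_offline
        or req.latency_budget_ms < latency_threshold_ms
        or req.per_inference_cost_matters
        or req.data_must_stay_on_device
    )


# A camera filter with a tight latency budget leans on-device;
# a nightly batch job with no privacy constraint leans server-side.
camera_filter = FeatureRequirements(False, 33.0, False, False)
nightly_batch = FeatureRequirements(False, 5000.0, False, False)
```

A real decision also weighs model size, device coverage, and maintenance cost, which a boolean check like this does not capture.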

Will the model fit in our app?

Almost always yes. The sharper question is whether it still fits after the next three features on the roadmap. I review your binary budget and model options before you commit.

How do you handle model updates?

Two clean options: bake the model into the binary (every update is an App Store cycle) or fetch it at runtime with integrity checks, caching, and a fallback path. Teams that commit to neither and mix the two end up with bugs neither approach would produce on its own.
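
As a rough illustration of the runtime-fetch option, the sketch below shows the three pieces working together: an integrity check, a cache, and a bundled fallback. It is written in Python for brevity rather than as app code. `resolve_model`, `sha256_of`, the paths, and the injected `fetch` callable are all hypothetical, and it assumes the expected SHA-256 digest arrives through a channel you already trust.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of a file, read in chunks so large models don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resolve_model(cache_path: Path, bundled_path: Path, expected_sha256: str,
                  fetch: Callable[[], bytes]) -> Path:
    """Pick the model file to load: verified cache, verified download, or bundled."""
    # 1. Reuse the cached copy only if it still matches the expected digest.
    if cache_path.exists() and sha256_of(cache_path) == expected_sha256:
        return cache_path

    # 2. Otherwise download, verify before trusting, then swap into place atomically.
    try:
        data = fetch()
        if hashlib.sha256(data).hexdigest() != expected_sha256:
            raise ValueError("downloaded model failed integrity check")
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp_name = tempfile.mkstemp(dir=cache_path.parent)
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, cache_path)
        return cache_path
    except Exception:
        # 3. Network error, bad digest, or disk failure: use the bundled model.
        return bundled_path
```

The caller always gets back a loadable path, so a corrupted cache or a failed download degrades to the bundled model instead of breaking the feature.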

How do I get a quote?

Two paths. If you need speed, send me a detailed brief and I'll quote from it (usually within 48 hours). If you'd rather talk first, book a free 30-minute scoping call and I'll quote after. Most clients who pick the brief path land on the call anyway once we get into the specifics, but the door is open either way.

How quickly can you start?

Advisory calls can happen within days. For project work, I typically need 1–2 weeks' notice to clear the calendar, though I keep some buffer for urgent firefighting. Check the availability badges above for current openings.

Do you work with early-stage startups?

Yes, from pre-seed to Series C and beyond. For very early teams, the advisory tier often makes more sense than project work: you get architecture guidance without committing to a large engagement before you've validated the product.

What's included in the day rate?

Everything: code, architecture decisions, code review, documentation, async Slack availability during working hours. No surprise add-ons. I bill for time spent working on your project, not for "thinking about it in the shower."

How do you handle timezone differences?

Currently in Vancouver (Pacific Time), with full overlap for North American teams. For UK and Europe, I'm online by their afternoon. For Gulf or APAC, we'd agree on overlap hours and handle the rest async. I've worked with teams from San Francisco to Dubai.


Where I've worked

Drobinin Limited · Founder · 2025–present. 12+ apps from idea to App Store. Featured by Apple in EMEA & Americas.
LivaNova (NASDAQ: LIVN) · Senior iOS · 2020–2025. Epsy, an epilepsy management app. Shipped inside an FDA-regulated medical-device company. HIPAA, CES Innovation Award.
Sphere (acquired by Twitter/X) · Senior iOS · 2017–2020. Early employee. $30M funding to acquisition.
VK.com · iOS Consultant · 2016–2017. Authored & delivered an onsite course on iOS development.
ToBox · Lead iOS · 2015–2016. Built team, MVVM architecture, full Swift rewrite.

Shipping an on-device ML feature?

Describe what you're working on, or book a free 30-min scoping call. I reply within 48 hours.

work@drobinin.com