AI Dev Day: Measuring Developer Productivity

AgentSunrise

AI Dev Day: how big tech measures AI efficiency in development

On March 15, 2026, Yandex held its second AI Dev Day — a meetup about real-world experience implementing AI tools in development workflows. Speakers included representatives from Yandex, Avito, Ozon, T-Bank, Sber, and Yandex Go. Here are the key takeaways.


1. AI productivity at Yandex — Andrey Popov

  • 57% of engineers use AI tools (60–75% in back-end, front-end, and mobile); daily active users (DAU): 36%
  • Generated code: 23% in agent mode, 30% including suggestions
  • Total savings: ~42,000 hours/month ≈ 2% of total time (employees’ self-assessed savings are 30%, but that figure is overstated)
  • Goal for 2026: grow to 10% savings
  • The focus has shifted from assistants to agent mode: the agent solves the task, and a human joins only when needed — analogous to the “disengagement rate” in autonomous cars
  • 90%+ of the infrastructure is covered by MCP servers (35+ stable ones); top use cases: tracker work, search, data work
  • Information search: the agent reduces deep research time from 20 minutes to 2 minutes
  • Labor market takeaway: professions do not disappear, they merge — an engineer without a narrow specialization already handles tasks from adjacent roles
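
The savings figures above imply a total monthly engineering time, which makes the 2026 goal easier to picture. A minimal arithmetic sketch, assuming the ~42,000 hours and the 2% share refer to the same monthly base (the function names are illustrative and not from the talk):

```python
def implied_total_hours(saved_hours: float, saved_share: float) -> float:
    """Total monthly hours implied by the saved hours and their share of total time."""
    return saved_hours / saved_share


def hours_at_target(total_hours: float, target_share: float) -> float:
    """Hours saved per month if the target share of total time is reached."""
    return total_hours * target_share


# Figures from the talk: ~42,000 hours/month is about 2% of total time.
total = implied_total_hours(42_000, 0.02)   # roughly 2.1 million hours/month
# Stated 2026 goal: 10% savings on the same (assumed unchanged) base.
goal = hours_at_target(total, 0.10)         # roughly 210,000 hours/month
print(round(total), round(goal))
```

Under that assumption, reaching 10% would mean about five times the current monthly savings.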

2. GenAI adoption at Avito — Alexander Lukyanchenko (CTO Architecture & Tech Platform)

  • Main insight: even in the best teams, the entire development cycle (dev cycle time) speeds up by only 4–5%; coding itself takes only 32% of an engineer’s time
  • Fine-tuning open models did not pay off — external SOTA models with context deliver better results
  • Main measurement framework: adoption → AI-assisted PRs → cycle time
  • Approach: select a small group of teams with 100% adoption, run “agent retrospectives,” and iterate against the benchmark
  • SVE benchmark (Avito-specific): ~29% of tasks are solved autonomously
  • Agents perform well on automated tests, atomic routine tasks, decomposition, and code review (20–40% of agent comments lead to changes vs 65–70% of human comments)
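
The adoption → AI-assisted PRs → cycle time funnel can be sketched as a small calculation over pull request records. This is only an illustration: the record fields, the definition of adoption (share of team members with at least one AI-assisted PR), and the sample data are assumptions, not Avito's implementation.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class PullRequest:
    author: str
    ai_assisted: bool    # hypothetical flag, e.g. derived from a label on the PR
    cycle_hours: float   # hypothetical: hours from first commit to merge


def funnel(prs: list[PullRequest], team: set[str]) -> dict[str, float]:
    """Compute the three funnel stages for one team.

    Assumes both the AI-assisted and the other group are non-empty;
    statistics.median raises StatisticsError on empty input.
    """
    ai_prs = [p for p in prs if p.ai_assisted]
    other_prs = [p for p in prs if not p.ai_assisted]
    ai_authors = {p.author for p in ai_prs}
    return {
        "adoption": len(ai_authors & team) / len(team),
        "ai_assisted_pr_share": len(ai_prs) / len(prs),
        "median_cycle_ai": median(p.cycle_hours for p in ai_prs),
        "median_cycle_other": median(p.cycle_hours for p in other_prs),
    }


# Made-up sample data for a four-person team.
sample = [
    PullRequest("ann", True, 20.0),
    PullRequest("bob", False, 30.0),
    PullRequest("ann", True, 24.0),
    PullRequest("cat", False, 26.0),
]
result = funnel(sample, team={"ann", "bob", "cat", "dan"})
print(result)
```

Comparing the two medians within a team shows whether AI-assisted PRs actually move cycle time, rather than stopping at adoption.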

3. Code assistants at Ozon — Alexander Lukyanov (ML platform)

  • About 1,100 developers use the agent assistant per day (a daily usage share of 25–30%)
  • Switching from Continue + DeepSeek to MiniMax + OpenCode/Cline caused a sharp jump in adoption
  • Code review: ~1500 projects connected, up to 1000 reviews/day
  • Models are updated in days, not months — through abstract “scenario routes” without reconfiguration
  • External models (Claude, GPT) deliver better results on complex tasks, but are not broadly deployed due to code leakage risks
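
One way to read the “scenario routes” idea: clients ask for a stable scenario name, and a routing table decides which model serves it, so a model swap is a one-line change on the platform side. The sketch below is an assumption about how such a layer might look, with hypothetical scenario names and model ids; it is not Ozon's implementation.

```python
class ScenarioRouter:
    """Maps stable scenario names to whichever model currently serves them."""

    def __init__(self, routes: dict[str, str]) -> None:
        self._routes = dict(routes)

    def resolve(self, scenario: str) -> str:
        """Return the model id for a scenario; clients never hard-code model ids."""
        try:
            return self._routes[scenario]
        except KeyError:
            raise ValueError(f"unknown scenario: {scenario}") from None

    def swap(self, scenario: str, model_id: str) -> None:
        """Point an existing scenario at a new model without touching clients."""
        if scenario not in self._routes:
            raise ValueError(f"unknown scenario: {scenario}")
        self._routes[scenario] = model_id


# Hypothetical scenario names and model ids.
router = ScenarioRouter({"code_review": "model-a", "agent_chat": "model-b"})
before = router.resolve("code_review")
router.swap("code_review", "model-c")   # platform-side change only
after = router.resolve("code_review")
print(before, after)
```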

4. Measuring AI in SDLC — Anna Gromova (T-Bank)

  • Framework: DORA + SPACE + DX → a unified “metrics tree” for evaluating code delivery and developer comfort
  • AI assistant in the IDE: adoption 50% among IT employees, 70–75% among those who commit to GitLab
  • Median merge time decreased by 12%; for “ambassadors” (100% adoption), it decreased by 30% over the year
  • Unit test generation increased 4x; test-related requests account for 12% of all requests
  • Key takeaway: AI does not replace process redesign — if there is a bottleneck in CI/CD or code review, AI simply moves it further downstream
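
A reduction in median merge time, as reported above, is a relative change between the medians of two periods. A minimal sketch with made-up merge times (the data and function name are illustrative only):

```python
from statistics import median


def median_reduction(before_hours: list[float], after_hours: list[float]) -> float:
    """Relative reduction of the median merge time between two periods."""
    m_before = median(before_hours)
    m_after = median(after_hours)
    return (m_before - m_after) / m_before


# Made-up merge times in hours; the medians are 25 and 22.
merge_before = [10.0, 20.0, 25.0, 30.0, 50.0]
merge_after = [8.0, 18.0, 22.0, 27.0, 45.0]
reduction = median_reduction(merge_before, merge_after)
print(f"{reduction:.0%}")
```

The median is less sensitive than the mean to a few very long-running merge requests, which is one reason to prefer it for this kind of metric.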

5. Yandex Code Assistant — Sergey Buldyaev

  • A fork of an open-source agent with key enhancements: seamless authentication, one-click access to up-to-date models, one-click MCP connection, a marketplace of presets (an analogue of “linters for agents”)
  • The main challenge was adoption: skepticism was overcome through workshops for 1000+ engineers on real tasks
  • YQL agent: the main problem is that models do not know YQL → solved through a validation dataset (not LLM-as-judge) and tool-calling examples in the system prompt
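
Evaluating against a validation dataset instead of an LLM judge can be sketched as a deterministic harness: run each generated query and compare its result with the expected rows. Everything below is hypothetical (the `generate` and `execute` callables, the stubbed queries, and the cases); a real setup would call the model and a YQL engine.

```python
from typing import Callable

Rows = list[tuple]


def pass_rate(
    cases: list[tuple[str, Rows]],
    generate: Callable[[str], str],
    execute: Callable[[str], Rows],
) -> float:
    """Share of cases where the generated query returns the expected rows."""
    passed = 0
    for task, expected in cases:
        try:
            rows = execute(generate(task))
        except Exception:
            continue  # a query that fails to run counts as a failure
        if sorted(rows) == sorted(expected):
            passed += 1
    return passed / len(cases)


# Stubs standing in for the model and the query engine (assumptions).
def fake_generate(task: str) -> str:
    return {
        "count users": "SELECT COUNT(*) FROM users",
        "count orders": "SELEC oops",
    }[task]


def fake_execute(query: str) -> Rows:
    if query == "SELECT COUNT(*) FROM users":
        return [(3,)]
    raise ValueError("syntax error")


cases = [("count users", [(3,)]), ("count orders", [(7,)])]
rate = pass_rate(cases, fake_generate, fake_execute)
print(rate)
```

Because each check is an exact comparison, the score is reproducible from run to run, which is harder to guarantee with a model acting as judge.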

6. AID — AI for Designers at Sber — Maxim Shvedenko

  • A multi-agent system: three agents (Support, Reviewer, Generator) built on a single knowledge base for the design system — a closed-loop quality process
  • Before: reviewing one screen took 30 min–2 hours, fixing comments took 8 hours, a new screen took 16+ hours
  • Generator: business requirements → formalization → JSON specification → rendering components in Figma/React from the design system
  • The reviewer slices the mockup into layers; each check type is a separate agent with compressed context

7. SRE + AI at Yandex Go — Alexander Fisher

  • SRE GPT — a multi-agent system for incident analysis: covers almost 100% of 400 incidents/day (previously ~99% were not analyzed at all)
  • Savings: 30 min × 400 incidents = ~200 hours/day on postmortems alone
  • Root cause identification accuracy: ~40–44%, which is on par with the global benchmark (Microsoft, Meta, Google)
  • Prompts in Russian do not work in SRE: there is no stable terminology → switched to English
  • Prerequisites: own cloud, observability platform, service catalog, dependency graph, event audit
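
The savings estimate above is simple arithmetic, and the accuracy range can be turned into an approximate daily count in the same way. A small sketch, assuming all 400 incidents are analyzed and the reported accuracy applies uniformly to them (both are simplifications):

```python
INCIDENTS_PER_DAY = 400
MINUTES_SAVED_PER_INCIDENT = 30

# 30 min x 400 incidents, converted to hours.
hours_saved_per_day = INCIDENTS_PER_DAY * MINUTES_SAVED_PER_INCIDENT / 60

# Assumption: the ~40-44% accuracy range holds across all analyzed incidents.
correct_low = round(INCIDENTS_PER_DAY * 0.40)
correct_high = round(INCIDENTS_PER_DAY * 0.44)

print(hours_saved_per_day, correct_low, correct_high)
```

Under these assumptions, roughly 160 to 176 incidents per day would get a correctly identified root cause.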

General conclusions

All companies agree on several points:

  1. Adoption is the hardest stage. The technology works, but without training, workshops, and clear security policies, people simply do not start using the tools.
  2. Agent mode matters more than autocomplete. The real impact comes not from IDE suggestions, but from an agent that independently closes tasks.
  3. Measurement must be done correctly. Adoption and “amount of generated code” are not business metrics. Cycle time, merge time, and change failure rate matter.
  4. MCP has become the standard. All teams are building context infrastructure through MCP servers.
  5. SOTA models outperform fine-tuning. Investing in additional training for open models is not cost-effective — external models with context deliver better results.
