What Do Data Teams Actually Look Like in 2026? A Large LinkedIn Benchmark

Tags: Data Science, Benchmarking, AI Teams
I analyzed LinkedIn data on 213 tech companies to benchmark data and AI team composition. Role mix matters more than team size.
Author: Luca Fiaschi

Published: March 14, 2026

“How big should my data team be?” is the question every VP of Data gets asked. It’s also the wrong starting point. Team size depends on so many company-specific factors that a single percentage is almost useless. The more actionable question is: what should your data team look like? What roles do you need? How AI-specialized should it be? When should you hire a data leader?

The best benchmark available was a 2023 study of 100 scaleups, which predated the AI Engineer boom entirely. I scraped LinkedIn to build an updated picture.

How I Collected the Data

I started with about 500 Series B+ tech companies drawn from nine sources (YC Top Companies, Forbes AI 50, a16z/Sequoia/Index/Accel/Lightspeed portfolios, Crunchbase, public listings). If you’d like access to the raw data, get in touch.

For each company, I collected two things from publicly available LinkedIn data: total employee count (from company pages) and data/AI team composition (by searching for roles like “data scientist”, “data engineer”, “machine learning”, and “AI” among each company’s listed employees). I gathered complete data for 213 companies. Of those, 188 had at least one visible data/AI professional. The remaining 25 returned zero results.

There’s an important limitation to this approach. LinkedIn search results are capped at about 12 per query. At a 100-person company, that’s enough to spot most of the data team. At a 5,000-person company, you’re seeing a tiny sliver. This means I can observe which roles exist and what the mix looks like, but I can’t reliably count how many data people a large company has. Keep this in mind as you read, because it limits some of the conclusions. For the “how big” question, SYNQ’s 2023 study (which manually counted data teams at 100 companies) found a median of 3% of total headcount and remains the best size benchmark.
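To make the effect of the cap concrete, here is a minimal sketch of how a per-query limit turns true role counts into lower bounds. The cap of 12 comes from the description above; the function names and the example role sizes are hypothetical and chosen only for illustration.

```python
RESULTS_CAP = 12  # approximate per-query cap described above


def observed_count(true_count: int, cap: int = RESULTS_CAP) -> int:
    """Number of people a single capped search can surface for one role."""
    return min(true_count, cap)


def visible_fraction(true_count: int, cap: int = RESULTS_CAP) -> float:
    """Share of a role's true headcount that the capped search reveals."""
    if true_count == 0:
        return 1.0  # nothing to miss
    return observed_count(true_count, cap) / true_count


# Hypothetical role sizes, from a small startup to a large company.
for true_count in (5, 12, 60, 200):
    print(true_count, observed_count(true_count), f"{visible_fraction(true_count):.0%}")
```

Once a role has more people than the cap, the observed count stays at 12 no matter how large the team is, which is why absolute headcounts at large companies cannot be recovered from this kind of data.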

Because of the limitation on absolute team size described above, everything in this post focuses on team composition and role adoption, metrics that don’t depend on seeing every employee. One more caveat: the sample consists of VC-backed tech companies. If you’re at a bank, a pharma company, or a government agency, the role mix may look different.

The Data Foundation Still Comes First

Across 188 companies with visible data/AI roles, I found roughly 2,000 professionals. I classified them into two tiers:

Tier 1 (Core Data): Data Scientist, Data Engineer, ML Engineer, Analytics Engineer, Data Analyst. Roles that existed before the AI boom.

Tier 2 (AI/ML Extended): AI Engineer, Research Scientist, MLOps, CV/NLP Engineer, Prompt Engineer. Roles that have spread since 2023.
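A two-tier classification like this can be approximated with keyword matching on job titles. The sketch below is illustrative only: the patterns are my assumption of what such a matcher could look like, not the exact rules used for this study.

```python
import re
from typing import Optional

# Hypothetical patterns. Word boundaries (\b) avoid false hits such as
# "HTML Engineer" matching "ml engineer".
TIER2_PATTERNS = [
    r"\bai engineer\b", r"\bresearch scientist\b", r"\bmlops\b",
    r"\bcomputer vision\b", r"\bnlp\b", r"\bprompt engineer\b",
]
TIER1_PATTERNS = [
    r"\bdata scientist\b", r"\bdata engineer\b", r"\bml engineer\b",
    r"\bmachine learning engineer\b", r"\banalytics engineer\b",
    r"\bdata analyst\b",
]


def classify_title(title: str) -> Optional[str]:
    """Return "T2", "T1", or None for titles that match neither tier."""
    t = title.lower()
    # Tier 2 is checked first so the more specific AI titles win.
    if any(re.search(p, t) for p in TIER2_PATTERNS):
        return "T2"
    if any(re.search(p, t) for p in TIER1_PATTERNS):
        return "T1"
    return None


print(classify_title("Senior Data Scientist"))  # T1
print(classify_title("Staff AI Engineer"))      # T2
print(classify_title("Account Executive"))      # None
```

Keyword rules are blunt: a title like “Applied Scientist” falls through unless a pattern is added for it, so any such classifier needs a manual review pass.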

Role composition across 213 companies

Data Scientists (556) and Data Engineers (418) together make up about half of all classified roles (974 of roughly 2,000). The overall ratio is 3.6 Tier 1 roles for every Tier 2 role. The AI wave added new role types. It didn’t replace the existing ones.
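The 3.6-to-1 ratio can be restated as a share, which makes it easier to compare with the per-vertical percentages later in the post. This is a small arithmetic sketch using only the figures quoted above; the function name is mine.

```python
def t2_share_from_ratio(t1_per_t2: float) -> float:
    """Convert "N Tier 1 roles per Tier 2 role" into Tier 2's share of all roles."""
    return 1 / (1 + t1_per_t2)


ds_plus_de = 556 + 418  # Data Scientists + Data Engineers
pooled_t2_share = t2_share_from_ratio(3.6)

print(ds_plus_de)                 # 974
print(f"{pooled_t2_share:.1%}")   # 21.7%
```

In other words, across the whole sample a little over one in five classified roles is Tier 2.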

The Role Adoption Curve

Which roles are table stakes and which are still emerging? I looked at what percentage of companies have at least one person in each role.

Role adoption rates across 188 companies

Data Engineer (72%) and Data Scientist (67%) are the most widely adopted roles, each present at roughly two-thirds of companies or more. ML Engineer sits at 57%. Then there’s a clear drop: AI Engineer at 37%, with Analytics Engineer and Data Analyst in the 30-34% range.

The Tier 2 roles map an adoption curve. Over a third of companies already have an AI Engineer, making it the most widely adopted of the post-2023 roles. Research Scientist (16%) and MLOps (12%) are growing but still niche. Prompt Engineer (2%) is barely a category.
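Adoption here means “at least one person in the role,” so each company counts once regardless of how many people it employs in that role. A minimal sketch of that calculation, with made-up company names and counts:

```python
from collections import Counter
from typing import Dict


def adoption_rates(company_roles: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Share of companies with at least one person in each role."""
    n = len(company_roles)
    if n == 0:
        return {}
    companies_with_role = Counter()
    for role_counts in company_roles.values():
        for role, headcount in role_counts.items():
            if headcount > 0:
                companies_with_role[role] += 1  # count the company once
    return {role: k / n for role, k in companies_with_role.items()}


# Hypothetical data: visible headcount per role at four companies.
sample = {
    "Company A": {"Data Engineer": 5, "Data Scientist": 3},
    "Company B": {"Data Engineer": 1, "AI Engineer": 2},
    "Company C": {"Data Scientist": 12, "Data Engineer": 12},
    "Company D": {"Data Scientist": 1},
}
rates = adoption_rates(sample)
print(rates["Data Engineer"])  # 0.75
print(rates["AI Engineer"])    # 0.25
```

Because the metric only asks whether a role exists, a company whose search results hit the cap still counts the same as one with a single visible person in that role.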

If you’re building a data team from scratch: Data Engineer and Data Scientist first. ML Engineer once you have models in production. AI Engineer once AI is a product feature, not a research project.

How Industry Shapes Team Composition

The most useful vertical comparison isn’t team size (confounded by LinkedIn’s visibility cap) but team shape: what fraction of the data team works on AI-specific roles?

T2 share by industry vertical

At AI/ML companies, 43% of visible data/AI roles are Tier 2. At fintech companies, it’s 9%. Enterprise SaaS sits at 11%. DevTools and cybersecurity land in the 18-20% range.

To make sure this isn’t just a size effect (AI/ML companies tend to be smaller), I compared verticals within the same size band. Among mid-size companies (201-1,000 employees, n=68), where LinkedIn visibility is roughly equal:

Vertical          n    Median Roles   T1 (Core)   T2 (AI)   T2 Share
AI/ML             16   12             7           6         46%
Fintech           8    8.5            8           0         0%
Cybersecurity     7    8              7           2         21%
DevTools          8    7.5            5           2         23%
Enterprise SaaS   11   7              5           1         6%
Data Infra        15   5              3           2         33%
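A within-band comparison like the one in the table can be sketched as follows. I am assuming the table reports the median of company-level T2 shares within each vertical; the rows below are invented, and the helper names are hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional, Tuple

# Hypothetical rows: (vertical, total employees, visible T1 roles, visible T2 roles)
Row = Tuple[str, int, int, int]


def t2_share(t1: int, t2: int) -> Optional[float]:
    """Tier 2 share of visible data/AI roles; None if no roles are visible."""
    total = t1 + t2
    return t2 / total if total else None


def median_t2_share_by_vertical(rows: List[Row], low: int = 201,
                                high: int = 1000) -> Dict[str, float]:
    """Median company-level T2 share per vertical, within one size band."""
    shares: Dict[str, List[float]] = {}
    for vertical, employees, t1, t2 in rows:
        if not (low <= employees <= high):
            continue  # outside the size band
        share = t2_share(t1, t2)
        if share is not None:
            shares.setdefault(vertical, []).append(share)
    return {v: median(vals) for v, vals in shares.items()}


rows = [
    ("AI/ML", 300, 6, 6), ("AI/ML", 800, 7, 5), ("AI/ML", 5000, 10, 2),
    ("Fintech", 400, 8, 0), ("Fintech", 900, 9, 1), ("Fintech", 250, 7, 0),
]
print(median_t2_share_by_vertical(rows))
```

In this toy data one fintech company has a Tier 2 role, yet the fintech median is still 0 because most fintech rows have none. That is one way a vertical can show a 0% median within a size band while its pooled share across all companies is above zero.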

The fintech row is worth pausing on. At the same company size, fintech teams have roughly the same total data headcount as cybersecurity or DevTools companies, but a median of zero AI-specific roles (with the caveat that n=8). Their data teams are almost entirely traditional: Data Scientists, Data Engineers, Analysts. Meanwhile, at similarly sized AI/ML companies, almost half the data team works in AI-specific roles.

This isn’t about some companies investing more in data than others. The visible headcount is similar. It’s about what those people do. At an AI company, the data team builds the product. At a fintech company, it supports analytics and risk models. Yes, this is partly tautological: of course AI companies have more AI roles. Still, keeping in mind the data collection limitations above, the comparison suggests that total data team size is comparable across verticals and that the difference sits almost entirely in the AI-specific tail.

AI-Native Companies Are Built Differently

I classified 43 of the 213 companies as “AI-native” (AI/ML core product, founded after 2018). This classification is based on the company’s primary product, not just whether they use AI internally. Controlling for company size, the composition differences hold up.

AI-native vs traditional team composition

Among small companies (50-200 employees), AI-native companies (n=21) have a median T2 share of 40%, while traditional companies (n=19) sit at 0%. Among mid-size companies (201-1K), AI-native companies (n=11) have a 55% T2 share vs 20% for traditional (n=57).

The company-level examples make this concrete. Traditional tech companies are almost entirely Tier 1: Spotify (31 T1, 2 T2), Atlassian (28 T1, 2 T2), Snowflake (24 T1, 2 T2). AI-native companies flip the ratio: Luma AI (3 T1, 12 T2), Mistral AI (1 T1, 10 T2), Aleph Alpha (4 T1, 9 T2). At these companies, research scientists and AI engineers outnumber traditional data roles.
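Converting the counts quoted above into Tier 2 shares makes the flip easier to see. The sketch uses only the numbers from the paragraph; keep in mind these are visible counts, so they are lower bounds on actual headcount.

```python
# (visible T1 roles, visible T2 roles) as quoted above
examples = {
    "Spotify": (31, 2),
    "Atlassian": (28, 2),
    "Snowflake": (24, 2),
    "Luma AI": (3, 12),
    "Mistral AI": (1, 10),
    "Aleph Alpha": (4, 9),
}


def t2_share(t1: int, t2: int) -> float:
    """Tier 2 share of all visible data/AI roles at one company."""
    return t2 / (t1 + t2)


shares = {name: t2_share(t1, t2) for name, (t1, t2) in examples.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

The three traditional companies come out at 6-8% Tier 2, while the three AI-native examples range from about 69% to 91%.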

When AI is the product, the data team is the product team.

When Data Leadership Appears

Of the 213 companies, 60 (28%) have identifiable data/AI leadership on LinkedIn (Head of Data, VP of AI, CDO, or similar).

Leadership presence by funding stage

The inflection point is clear. At Series B and C, only 10-12% of companies have a formal data leader. At Series D, it jumps to 33%. Series E+ and public companies are at 45-48%.

AI-native companies formalize data leadership much earlier. Among small companies (50-200 employees), 17% of AI-native companies already have a data leader vs just 2% of traditional companies (though both groups are small, so treat these as directional).

Within the mid-size band (201-1K employees), leadership rates are more uniform across verticals: AI/ML at 31%, cybersecurity at 43%, DevTools at 50%. The overall differences by vertical are largely driven by company size (bigger companies are more likely to have formalized leadership), not by industry.
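With groups this small, the percentages carry wide uncertainty. The sketch below computes a Wilson score interval, a standard choice for proportions at small n. The success counts (5 of 16, 3 of 7, 4 of 8) are my inference from the rounded percentages and the mid-size group sizes in the earlier table, not figures reported directly.

```python
from math import sqrt
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Inferred counts (assumption): companies with a data leader / group size.
groups = {"AI/ML": (5, 16), "Cybersecurity": (3, 7), "DevTools": (4, 8)}
for name, (k, n) in groups.items():
    low, high = wilson_interval(k, n)
    print(f"{name}: {k}/{n} = {k / n:.0%}, interval {low:.0%} to {high:.0%}")
```

For DevTools, for example, 4 of 8 gives an interval of roughly 22% to 78%. The intervals for all three verticals overlap heavily, which is consistent with reading these differences as noise rather than an industry effect.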

The most common titles: Head of Data Science, VP of AI, Chief AI Officer, Head of Data Platform. The CDO title, which dominated a few years ago, has been partly displaced by AI-specific titles. My own title at Mistplay, “Chief Data & AI Officer,” was an example of this convergence: the role covered AI, data platform, and analytics, because when data systems directly drive revenue, AI, platform, and analytics need to be tightly coordinated.

Practical Guidance

If you’re building or benchmarking a data team at a VC-backed tech company:

  • Data Engineers and Data Scientists are table stakes. Present at 72% and 67% of companies respectively. Hire these first.
  • ML Engineer is the bridge role. At 57% adoption. This is the role that takes data science from notebooks to production.
  • AI Engineer is already widely adopted. At 37% of companies, and far higher at AI-native companies. If AI features are on your product roadmap, this hire is next.
  • Your AI role mix should match your product. If AI is your core product, expect 40-50% of your data team to be Tier 2. If AI supports your product, 10-20% is normal. If you’re in fintech, you might not need AI-specific roles at all right now.
  • At the same company size, vertical matters for role mix but not team size. Mid-size companies across verticals show roughly the same number of visible data people (a median of 5-12 per company). The difference is in what they do, not how many there are.
  • Formalize data leadership early, especially if you’re AI-native. AI-native companies that hired a data leader by Series B are not unusual in my data (21%), while traditional companies at the same stage almost never do (4%). If data drives your product or revenue, don’t wait.

The interactive explorer lets you search all 213 companies. If you’d like access to the full raw LinkedIn data, get in touch.


Methodology: ~500 companies curated from 9 sources. Employee counts and data/AI team composition collected from publicly available LinkedIn profiles. 6 role-keyword searches per company. Complete data gathered for 213 companies. Roles classified into Tier 1 (Core Data) and Tier 2 (AI/ML Extended) by keyword matching on titles. LinkedIn visibility is capped at ~12 results per query, so role counts are lower bounds and larger companies are undercounted. This study reports team composition and role adoption metrics, which are far less sensitive to the visibility cap than absolute headcounts. The sample is VC-backed tech companies and may not generalize to other sectors.