Kamba  ·  White Paper  ·  February 2026

The Workflow Gap: Why More Data and More AI Still Isn't Working

CIOs and Heads of Data are sitting on rising data budgets, complex vendor stacks, and pressure to "do something" with AI. This paper examines where the market actually is, what is keeping workflows broken, and what genuine alpha impact requires — beyond pilots.

Section 1

Executive Overview

Data spend is up. AI is a boardroom priority. Workflows are still broken. The firms that pull ahead in 2026 will treat data and AI as a single operating system — not two separate projects sharing a quarterly sync.

The buy-side data market in 2025 presents a surface-level paradox: budgets are expanding and AI adoption has surged, yet operating margins are compressing, and a recurring theme across 2025 surveys is that data teams feel more stretched, not less, despite the investment. McKinsey found pre-tax margins fell three percentage points in North America and five in Europe between 2019 and 2023, even as technology spend increased: a signal that spend alone is not the answer.

The five sections that follow move from context to diagnosis to cost. Section 2 frames where budgets, stacks, and market scale actually stand. Section 3 catalogues the failure modes that appear consistently across 2025 and early 2026 research. Section 4 addresses what firms are actually doing with AI versus what they say. Section 5, the core of this paper, quantifies the opportunity cost of workflow latency across firm sizes, including a full P&L scenario at the $5bn tier. Section 6 covers the 2026 benchmark workflow: what a mature operating model looks like independent of tooling.

The core thesis: workflow is the bottleneck, not data access and not AI access. Firms that are capturing the most value have compressed the discovery → diligence → backtest → production loop and built governance into that loop from the start — not retrofitted it at audit time.

Section 2

Market Snapshot 2025: Budgets, Stacks, and Scale

Alternative data is mainstream. Market data budgets are resilient. Both are growing faster than the workflows required to extract value from them.

Alternative data: from niche to standard infrastructure.

  • Three-quarters of buy-side firms now use non-traditional data sources in their research or investment processes. (Coalition Greenwich)
  • 90% of private fund respondents in a Feb 2026 survey currently use alternative data — up from 67% in 2024 and 62% in 2023. (Lowenstein Sandler, Feb 2026)
  • Over two-thirds of those respondents report alt-data budgets exceeding $1M/yr; 89% plan to increase spend, with 96% of those budget increases directed toward AI-related products. (Lowenstein Sandler, Feb 2026)
  • Nearly two-thirds of buy-side firms expect to increase alternative data spending in the next year — about a quarter of those by more than 10%. (Coalition Greenwich)

Note: figures differ because they reflect different survey populations and definitions — broad buy-side firms (Coalition Greenwich) vs private fund respondents (Lowenstein Sandler) — and are not directly comparable.

Despite that spending intent, the route from data purchase to portfolio impact remains long. Deloitte's Center for Financial Services notes that fully incorporating a new alternative dataset into the investment decision process — through discovery, diligence, contracting, integration, and validation — can span two to three years. This refers to full institutionalization: governance, scaling, monitoring, and broad adoption across the platform. Time-to-first-production for a specific workflow path is shorter — typically three to six months in practice — and is the figure modelled in Section 5. The bottleneck is almost never access. It is workflow.

Market data: large, persistent, and modernizing.

  • Global spending on financial market data and news reached $44.3bn in 2024, up 6.4% year-on-year. (Burton-Taylor / Finextra, 2025)
  • Nearly 70% of buy-side buyers expect market data budgets to increase 1–5% in the next 12 months; very few plan cuts. (SIX + Coalition Greenwich, Q3 2025)
  • Cloud delivery has accelerated sharply: 63% of firms now receive market data via public cloud connectivity, versus just 30% in 2023. (SIX + Coalition Greenwich, Q3 2025)
  • 65% use real-time market data throughout the trading day, up from 54% in 2024; over three-quarters are seeking more or better historical tick data. (SIX + Coalition Greenwich, Q3 2025)

Despite cloud migration, spend rationalization and usage analytics remain immature at most firms. A WatersTechnology benchmark found 70% of buy-side firms are looking to outsource at least one aspect of market data management — a signal that internal capabilities are stretched.

Alt-Data Budget Outlook — Next 12 Months
Private fund respondents. Lowenstein Sandler survey, Nov–Dec 2025 (n=107, released Feb 2026).
Market Data: Budget Direction & Cloud Delivery
Buy-side firms. SIX + Coalition Greenwich, Jun–Jul 2025 (n=50).
All bar charts reflect figures from the cited sources, rounded where the source reports approximations (e.g., "nearly 70%"). No figure has been interpolated or estimated beyond source rounding.
Section 3

What Buy-Side Firms Are Struggling With in 2025–2026

Strip away the marketing and the same failure modes appear across every credible survey: integration complexity, governance friction, cost opacity, and a culture that rewards pilots over production.

1. The data itself is not the problem — integration is.

  • 79% of fundamental PMs and analysts say combining data from different sources is the most frustrating challenge when working with alternative data. (Exabel / BattleFin, Jan 2025; n=130, ~$820B AUM)
  • 98% agree that traditional data and official figures are becoming too slow to reflect changes in economic activity — making fast, reliable alt-data onboarding increasingly mission-critical. (Exabel / BattleFin, Jan 2025)
  • Deloitte describes onboarding a new data vendor as involving thorough due diligence, contract negotiation, and data storage and access rights work — a multi-stage process that routinely stretches across quarters, not weeks. (Deloitte CFS)

2. AI is adding cost without yet adding proportional value.

  • 81% of fund respondents report cost increases for alt-data products that incorporate AI features — yet this has not translated into systematic improvements in workflow speed or signal quality for most buyers. (Lowenstein Sandler, Feb 2026)
  • Only 16% of asset managers have fully defined an AI strategy and are implementing it throughout their business — despite 66% calling it a strategic priority. (BCG, May 2024)
  • McKinsey found that asset managers are allocating 60–80% of technology budgets to run-the-business initiatives, leaving a structurally small share for genuine workflow transformation. (McKinsey, July 2025)

3. Governance and compliance are structural, not optional.

  • The SEC's April 2022 Risk Alert noted that exam staff observed advisers using alternative data without reasonably designed written policies and procedures to address MNPI risks — including ad hoc, inconsistent diligence and no memorialization of that diligence. (SEC Div. of Examinations Risk Alert, Apr 2022)
  • 85% of investment-industry employers see a need for industry-wide AI standards and ethical guidelines; 82% say the absence of those standards is actively slowing adoption. (CFA Institute, Aug 2024; n=200)
  • The EU AI Act's general date of application is 2 August 2026, with full effectiveness expected by 2027 — putting AI auditability and model oversight on a compliance clock for firms with EU exposure. (EPRS AI Act Timeline, Jun 2025)

4. "Pilot purgatory" is widespread and measurable.

  • McKinsey's analysis of pre-tax operating margins shows a multi-year decline — three points in North America, five in Europe — despite increasing technology investment, indicating that spend is not compounding into capability. (McKinsey, July 2025)
  • Only 16% of asset managers have moved beyond strategy declaration to full implementation, while 75% are dedicating capital and people in the short term — a gap that describes pilot purgatory precisely. (BCG, May 2024)

The SEC's enforcement trajectory matters. Its first enforcement action against an alternative data provider (App Annie, September 2021) focused on misrepresentations about how data was derived and what controls existed. Governance and data provenance are not only workflow best practices — they are regulatory expectations. (SEC App Annie enforcement, Sept 2021)
Sources: Exabel/BattleFin (Jan 2025, 130 PMs, ~$820B AUM); Lowenstein Sandler (Feb 2026, n=107); BCG (May 2024); McKinsey (July 2025); CFA Institute (Aug 2024, n=200); SEC Risk Alert (Apr 2022); EPRS (Jun 2025).
Section 4

AI & Agentic Workflows: Hype vs Reality

AI is widely deployed. It is not widely working. The gap between "we use AI" and "AI creates measurable portfolio impact" is where most firms are currently stuck.

The adoption picture.

  • 66% of buy-side survey respondents now use AI/LLMs for internal productivity and workflow efficiency — the dominant use case. Only 36% have adopted AI-processed data to optimise investment or trading strategies. (Neudata, Feb 2026)
  • Among 300 CFOs, CIOs, and portfolio managers surveyed independently by Opinium, AI and generative AI was the most frequently raised topic over the past 12 months — ahead of sustainable investing, thematic strategies, and regulatory change. (Index Industry Association, 2024)
  • 80% of market data professionals view AI/ML as a key driver of data delivery and consumption over the next 2–3 years. Yet the same study reports 90% see AI's near-term role as primarily a recommendation tool, with humans retaining final decisions. (SIX + Coalition Greenwich, Q3 2025)

The execution gap.

  • 72% of asset managers expect GenAI to have significant or transformative impact within 3–5 years. 66% have made it a strategic priority. Only 16% have fully defined a strategy and are implementing it throughout their business. (BCG, May 2024)
  • McKinsey estimates that AI and agentic workflows could deliver value equivalent to 25–40% of an average asset manager's cost base — but only if embedded into redesigned workflows, not deployed alongside existing processes. (McKinsey, July 2025)
  • Technology spend has not consistently translated into productivity: pre-tax operating margins declined over 2019–2023 even as spend increased, reflecting the limits of bolted-on tooling without workflow redesign. (McKinsey, July 2025)
AI/LLM Initiatives Taken by Buy-Side Firms
% of respondents. Neudata survey, 2025 results. Productivity dominates; investment-strategy wiring is still limited.
The GenAI Strategy Gap
Asset managers globally. BCG Global Asset Management Report, May 2024.
The 50-point gap between "strategic priority" (66%) and "actually implementing" (16%) is where workflow infrastructure matters most. AI-native workflow infrastructure is what closes it — not another pilot.
Sources: Neudata Feb 2026 (AI use-type breakdown); SIX + Coalition Greenwich Q3 2025 (market data AI framing); BCG May 2024 (strategy gap); McKinsey July 2025 (cost impact and margin compression); IIA 2024 (topic frequency among CFOs/CIOs/PMs).
Section 5

Alpha Opportunity Lost: The Cost of Every Month of Delay

The cost of slow data onboarding is not the subscription fee. It is the alpha your portfolio never captured because the dataset arrived late. This model measures that cost in one unit: gross alpha opportunity lost, expressed in dollars, by firm size.

Illustrative model  ·  Gross, pre-costs  ·  Not a return estimate  ·  Results vary materially by strategy, universe, and execution
Model methodology note: This is an illustrative translation of workflow delay into dollar terms — not an empirical estimate of realized returns. It is designed to convey order of magnitude. The single metric throughout is alpha opportunity lost: the gross portfolio contribution foregone while a dataset sits outside production. Formula: AUM × coverage % × alpha (bps ÷ 10,000) × (months delayed ÷ 12). Staff and operational costs are real but are not included here — they are additive and noted separately below the table.
All figures are gross and pre-costs. They exclude market impact, turnover, borrow, fees, and signal decay. Net realizable alpha is typically 30–60% of gross. Capacity and crowding further cap contribution at scale. The model sizes the order of magnitude of delay — it does not predict returns.
  • Conservative tier: 10–50 bps/yr. Competitive liquid universe (e.g. large-cap equities). Signals crowd quickly. Model base: 15 bps.
  • Good / well-fit dataset: 50–150 bps/yr. Strong predictive fit, reasonable capacity for the strategy's universe. Achievable where signal differentiation is high. Model base: 60 bps.
  • Outlier / early-mover: 150–300+ bps/yr. Uncrowded signal, niche universe, strong execution edge. Decays as competition catches up — not a steady-state assumption.

Why the range is wide. Realized alpha depends on capacity, signal half-life, crowding, implementation costs, and portfolio role. A Sharpe-based check: a good dataset might add 0.05–0.20 to portfolio Sharpe. At 6–10% vol, that maps to roughly 30–200 bps/yr gross before costs (illustrative; assumes linear approximation and stable vol).
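The Sharpe check is a single multiplication. A minimal sketch, assuming only the linear approximation the text names (gross bps ≈ Sharpe uplift × portfolio volatility × 10,000):

```python
# Sharpe-to-bps sanity check: gross alpha (bps/yr) ~= delta_sharpe * vol * 10_000.
# Linear approximation; assumes stable volatility, as noted in the text.

def sharpe_to_bps(delta_sharpe: float, vol: float) -> float:
    """Map a Sharpe-ratio uplift to gross annual alpha in basis points."""
    return delta_sharpe * vol * 10_000

# Corners of the quoted ranges: 0.05-0.20 Sharpe uplift at 6-10% vol.
low = sharpe_to_bps(0.05, 0.06)   # ~30 bps/yr
high = sharpe_to_bps(0.20, 0.10)  # ~200 bps/yr
print(f"~{low:.0f} to ~{high:.0f} bps/yr gross")
```

This reproduces the roughly 30–200 bps/yr gross band quoted above; any intermediate Sharpe/vol pair lands inside it.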

  • Status quo onboarding: ~4 months. Practitioner surveys commonly report 3–6 months to first production. Deloitte notes 2–3 years for full institutionalization — a separate horizon defined in Section 2.
  • AI-native target: 3–6 weeks. Goal state observed in mature implementations. Actual time varies by complexity, data type, and governance requirements.
  • Delay compressed: ~3.25 months. The compressible window per dataset, per cycle — the period during which alpha opportunity is lost.
  • Deployment coverage: ~30% of AUM. Fraction of AUM to which a single dataset is relevant — a subset of sleeves or strategies, not the entire book. Varies 10–50%; 30% used as conservative mid-case.
Firm tier (AUM band)                 Model AUM · Datasets/yr   Per dataset: 15 bps · 60 bps   Total / yr (60 bps)
Small HF ($500m–$1bn)                $750m  ·  4               ~$90k  · ~$370k                ~$1.5m
Mid-size fund ($1bn–$5bn)            $2.5bn ·  6               ~$305k · ~$1.2m                ~$7.3m
Large fund ($5bn–$20bn)              $10bn  · 10               ~$1.2m · ~$4.9m                ~$49m
Institutional / multi-PM ($20bn+)    $30bn  · 15               ~$3.7m · ~$14.6m               ~$219m
Per-dataset columns show the conservative (15 bps) and good (60 bps) tiers; totals use the good tier across all datasets. All figures are gross, pre-costs.
How to read this table. Each row answers one question: how much gross alpha did this firm fail to capture last year because its datasets onboarded over ~4 months instead of ~3–6 weeks? Formula: AUM × 30% coverage × bps (÷ 10,000) × (3.25 ÷ 12) × datasets per year. The range shown (conservative → good) spans 15 bps to 60 bps gross. At the large and institutional tiers, capacity constraints and signal decay mean net realizable alpha will be a fraction of the gross figure — read these as sizing the problem, not a recovery guarantee.
Staff drag not included above. Manual data operations also consume quant and PM time — typically 20–30% of relevant FTE at $350k–$450k fully loaded. This adds roughly $130k–$1.35m per year depending on firm size, and is additive to the alpha opportunity lost figures above. It is excluded here to keep the model focused on a single, auditable metric.
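For readers who want to audit the arithmetic, the table reproduces directly from the stated formula. A minimal Python sketch using only the assumptions named above (15/60 bps tiers, 30% coverage, ~3.25 months compressed); tier names and model AUM figures are taken from the table:

```python
# Reproduce the alpha-opportunity-lost table:
# AUM x coverage x (bps / 10,000) x (months delayed / 12), per dataset,
# then scaled by datasets onboarded per year.

COVERAGE = 0.30          # fraction of AUM a single dataset touches
MONTHS_DELAYED = 3.25    # compressible onboarding window
CONSERVATIVE_BPS = 15    # gross, pre-costs
GOOD_BPS = 60

def lost_per_dataset(aum: float, bps: float) -> float:
    """Gross alpha opportunity lost for one dataset over the delay window."""
    return aum * COVERAGE * (bps / 10_000) * (MONTHS_DELAYED / 12)

tiers = [  # (tier name, model AUM, datasets per year)
    ("Small HF",      750e6,  4),
    ("Mid-size fund", 2.5e9,  6),
    ("Large fund",    10e9,  10),
    ("Institutional", 30e9,  15),
]

for name, aum, n in tiers:
    cons = lost_per_dataset(aum, CONSERVATIVE_BPS)
    good = lost_per_dataset(aum, GOOD_BPS)
    print(f"{name:14s} ~${cons/1e3:,.0f}k · ~${good/1e6:.2f}m per dataset "
          f"· ~${good * n / 1e6:.1f}m/yr at good tier")
```

Running this recovers every rounded figure in the table (e.g. mid-size: ~$305k conservative, ~$1.2m good per dataset, ~$7.3m/yr across six datasets), and three status-quo years at the mid-size tier compound to roughly 3 × $7.3m ≈ $22m, the cumulative figure cited in the text.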
Alpha Opportunity Lost per Year — Good Tier (60 bps, 30% coverage)
Gross alpha foregone annually from ~3.25 months of onboarding delay across all datasets. Pre-costs.

The compounding effect. Each row above represents one annual cycle. A mid-size fund running at status-quo onboarding speed for three years foregoes roughly $22m in cumulative gross alpha opportunity — before costs, before staff drag, before dead vendor spend on datasets that never reached production. The question is not whether workflow tooling is expensive. It is whether this number justifies re-examining the stack.

Zoomed-out view  ·  Full annual impact scenario

$5bn Fund: Two Types of Cost, Correctly Named

The per-dataset model above measures one thing: alpha opportunity lost — a P&L impact, the return your portfolio didn't earn. Workflow latency also generates a second, distinct cost: operational opportunity cost — the money wasted running an inefficient data operation regardless of whether any signal succeeds. These are different in kind. The table below keeps them separate.

Alpha Opportunity Lost
P&L impact — affects fund returns

The gross portfolio return your fund did not earn because a dataset that could have informed decisions was still being onboarded. This is a return-attribution line — it would appear (or not) in your performance record. It exists only if the delayed dataset had genuine alpha potential.

Operational Opportunity Cost
Cost impact — affects the data budget

The money wasted running an inefficient data operation — overspend on redundant vendor contracts and FTE time consumed by manual workflow that could be compressed or eliminated. This accrues regardless of whether any dataset adds alpha. It is a budget and efficiency loss, not a return loss.

What Drives Operational Opportunity Cost: Internal Data Waste Kamba Can Surface
Three categories of identifiable waste — each with a detection mechanism and a savings path.
  • Redundant datasets — the same concept arriving from multiple sources (internal lake + vendor feed; two vendors with overlapping coverage; legacy + new dataset both active) or the same data copied across multiple stores (Snowflake + S3 + Databricks) with separate compute and egress costs. Kamba identifies candidates via schema/field overlap, semantic similarity, usage duplication, and lineage analysis and pipeline equivalence checks. Savings path: cancel or downgrade one contract; remove duplicate pipelines and refresh jobs.
  • Decayed datasets — still being paid for and processed, but showing deteriorating freshness/quality, shrinking coverage, and declining downstream usage or contribution proxies. Detectable via data freshness indicators (lateness, null spikes, missing partitions), quality drift (distribution shift, coverage shrinkage), and downstream signals (reduced usage in notebooks and research queries, declining feature importance, or performance change on controlled ablation). Savings path: cancel, renegotiate, or replace subscriptions; stop processing datasets that no longer justify their cost.
  • Zombie datasets and pipelines — assets still running because "someone might need them": tables nobody queries, dashboards nobody opens, features nobody trains on, extracts created for a PM who left. Kamba identifies candidates via usage instrumentation across warehouse queries, BI tools, notebooks, model training jobs, feature store reads, and entitlement access logs — scored by time-since-last-use and production dependency. Savings path: deprecate safely using lineage-aware impact analysis, with rollback window and sign-off.
Output Kamba produces (first diagnostic cycle): a ranked list of datasets and contracts with (i) annual cost, (ii) usage score, (iii) quality/freshness score, (iv) downstream dependency risk, (v) recommended action. The fastest hard-dollar savings typically come from redundant subscriptions and underused datasets. All candidates are validated via lineage/dependency checks, usage telemetry, and governance sign-off before any deprecation or cancellation is actioned.
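The usage-and-cost ranking described above can be sketched as a toy scoring pass. This is an illustrative sketch, not Kamba's implementation: the field names, weights, and 90-day staleness threshold are invented for illustration.

```python
# Toy ranking of deprecation candidates by cost, staleness, and dependency risk.
# Illustrative only: fields, weights, and the 90-day threshold are assumptions,
# not a vendor's actual scoring model.
from dataclasses import dataclass

@dataclass
class DatasetAsset:
    name: str
    annual_cost: float        # subscription + compute/storage, USD/yr
    days_since_last_use: int  # from warehouse/notebook/training telemetry
    downstream_deps: int      # production pipelines that read this asset

def deprecation_score(d: DatasetAsset) -> float:
    """Higher score = stronger cancel/deprecate candidate.
    Stale, expensive, low-dependency assets rank first."""
    staleness = min(d.days_since_last_use / 90, 1.0)  # saturates at 90 days
    dep_penalty = 1.0 / (1 + d.downstream_deps)       # deps make removal risky
    return d.annual_cost * staleness * dep_penalty

inventory = [
    DatasetAsset("vendor_feed_a",  200_000, 3,   12),
    DatasetAsset("legacy_extract", 150_000, 210, 0),   # classic zombie
    DatasetAsset("overlap_feed_b", 180_000, 45,  1),
]
ranked = sorted(inventory, key=deprecation_score, reverse=True)
# legacy_extract ranks first: fully stale, no production dependencies.
```

The design point is the ordering logic, not the numbers: a real pass would layer in the schema-overlap, quality-drift, and lineage checks described in the bullets before anything is actually cancelled.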
Illustrative  ·  $5bn multi-PM fund  ·  5 high-quality datasets in pipeline  ·  Gross, pre-costs

Baseline data run-rate for context.

  • 30 market data feeds at ~$150k each ≈ $4.5m / year.
  • 15 alternative datasets at ~$200k each ≈ $3.0m / year.
  • 12 FTE at ~$300k fully loaded ≈ $3.6m / year.
  • Total data & workflow run-rate ≈ $11.1m / year.
$5bn fund scenario: status quo versus AI-native workflow infrastructure, by cost category.
  • Alpha opportunity lost (P&L impact: return not earned; 5 datasets, good tier, ~$1.2m each): ~$6.0m not earned at status quo; ~$1.6m not earned with AI-native workflow infrastructure; delta ~$4.4m recovered.
  • Operational opportunity cost (cost impact: budget wasted on vendor overspend + manual FTE): ~$1.9m wasted at status quo; ~$0 with AI-native workflow infrastructure; delta ~$1.9m saved.
  • Total annual impact (P&L recovered + costs saved): ~$6.3m / year.
How the operational opportunity cost of ~$1.9m is calculated: ~$1.0m vendor rationalization — redundant vendor contracts and internal duplication costs (compute, storage, maintenance across overlapping datasets and pipelines) identified as candidates via the methods above — plus ~$0.9m redeployable staff capacity — FTE hours freed from chasing stale internal data, rebuilding duplicate datasets, and manual lineage and entitlement work. Both are illustrative estimates. All identified candidates are validated via lineage/dependency checks and governance sign-off before any deprecation or cancellation is actioned. The ~$4.4m alpha recovery is the delta between status-quo and AI-native onboarding speed applied to 5 datasets at the good-tier assumption — traceable directly to the per-dataset model above.
Alpha opportunity lost dominates the total (~70% of the $6.3m). This matters for prioritisation: workflow latency is primarily a return problem, not a cost problem. Operational savings are real and worth capturing — but they are not the reason to re-examine the stack. The alpha line is.
Model assumptions: status quo ~4 months to first production; AI-native target 3–6 weeks; delay compressed ~3.25 months; deployment coverage 30% of AUM (range 10–50%); alpha tiers: conservative 15 bps/yr, good 60 bps/yr — gross, pre-costs. Alpha contribution varies materially by asset class, horizon, capacity, competition, and execution. Context: Exabel (79% cite integration as top challenge), Deloitte (2–3yr institutionalization), McKinsey (AI potential = 25–40% of cost base).
Section 6

What a Mature 2026 Workflow Actually Looks Like

The bar is moving from "we use AI" to "we run governed, traceable workflows that pull alpha forward." Here is what that looks like in practice — independent of vendor or tooling choice.

The characteristics that define mature firms.

  • Unified discovery: a single intelligent search layer across internal, market, and alternative data — not three separate interfaces with three separate workflows.
  • End-to-end agentic loop: question → Smart Search → DQR → Backtest → Procurement → Reporting, with human sign-off at defined checkpoints. This is the core of AI-native workflow infrastructure — not separate tools stitched together manually.
  • Governance as infrastructure: entitlements, data provenance, usage logs, and model audit trails are baked in from the start. The SEC's April 2022 Risk Alert makes clear that ad hoc diligence memorialization is an examination risk, not just an operational gap.
  • Usage-driven vendor management: rationalization decisions are driven by actual usage and impact data. A WatersTechnology benchmark found 70% of buy-side firms want to outsource at least one market data management function — signalling that internal capacity is structurally limited.
  • Compounding team capacity: data engineering effort goes into reusable, governed infrastructure — not one-off requests that reset on every hire.

The governance imperative is tightening.

IOSCO's guidance for AI and machine learning in asset management identifies governance and oversight, algorithm testing and monitoring, data quality and bias controls, explainability, and outsourcing risk as key categories requiring designated accountability at senior management level. The EU AI Act's general application date of 2 August 2026 puts auditability on a compliance clock for firms with EU market exposure — NIST's AI Risk Management Framework provides a practical, non-regulatory-specific structure for building trustworthy AI processes alongside it. The EDM Council's DCAM v3 and CDMC frameworks cover AI/cloud governance and 14 key controls for protecting sensitive data (including MNPI) in cloud environments specifically.

What measurable progress looks like in 90–180 days
Signals that workflow maturity is actually improving.
  • Time-to-signal for new datasets falls from months to weeks.
  • Data teams can point to reusable DQR and backtest infrastructure — not just completed tickets.
  • Compliance has clear visibility into which AI agents touched which data and when — addressing the SEC exam standards directly.
  • Vendor rationalization decisions are made on usage data, not renewal calendar pressure.
  • PMs report fewer delays between dataset availability and portfolio impact.
Standards & frameworks anchoring the governance layer
Recognised frameworks mature firms are building to.
  • IOSCO FR06/2021 — governance, testing, monitoring, explainability for AI/ML in asset management.
  • NIST AI RMF 1.0 — voluntary, non-sector-specific trustworthy AI framework.
  • EDM Council DCAM v3 / CDMC — data management for AI/cloud; 14 key controls for MNPI/PII in cloud environments.
  • EU AI Act — general application from 2 August 2026; full effectiveness by 2027.
Kamba's role. Kamba builds AI-native workflow infrastructure for data-intensive institutions — Smart Search, automated DQR and backtesting, procurement workflow, and governance built in. A diagnostic of where your firm sits against this benchmark is available on request.
Section 7

Sources

Every specific statistic cited in this paper is anchored to a primary or well-documented secondary source below, with methodology disclosure where available. Sources are grouped by type.

Primary — Buy-Side Surveys with Disclosed Methodology

  • SIX + Crisil Coalition Greenwich — Market Data in the Age of AI, Q3 2025. Survey of 50 buy-side firms, conducted June–July 2025. Methodology and respondent composition disclosed. Stats cited: ~70% expect market data budgets to rise 1–5%; 80% view AI/ML as a key driver over 2–3 years; 90% see AI primarily as a recommendation tool; 63% now receive market data via public cloud (vs 30% in 2023); 65% use real-time data throughout the trading day (up from 54% in 2024); over three-quarters seek more/better historical tick data.
  • Lowenstein Sandler — Annual Alternative Data Survey 2025, released February 2026. 107 respondents; online survey conducted November 9 – December 8, 2025; private fund managers. Stats cited: 90% currently use alt data (up from 67% in 2024, 62% in 2023); over two-thirds have alt-data budgets exceeding $1M/yr; 89% plan to increase alt-data budget; 96% of those increases directed toward AI; 81% report cost increases for AI-incorporated alt-data products; 89% say vendors are fully/mostly enabling AI analysis.
  • Exabel / BattleFin — Buy-Side Practitioner Survey, January 2025. 130 fundamental PMs and investment analysts across US, UK, Singapore, Hong Kong; respondents collectively manage ~$820B AUM. Stats cited: 79% say combining data from different sources is the most frustrating alt-data challenge; 98% agree traditional data is too slow to reflect changes in economic activity; 75% say consumer spending datasets will provide outsized informational edge.
  • CFA Institute — AI and GenAI in Investing: Employer Survey, August 2024. 200 investment-industry representatives; firms ranging from under $5B to over $100B AUM; conducted February 2024. Stats cited: 85% of employers see a need for industry-wide AI/GenAI standards and ethical guidelines; 82% say lack of standards hinders faster adoption; data privacy and security cited as major roadblock.
  • Index Industry Association (IIA) — Global Asset Manager Survey 2024. 300 CFOs, CIOs, and portfolio managers; Europe and US; fieldwork April–May 2024; conducted independently by Opinium Research. Stats cited: GenAI/ML was the most frequently raised topic over the past 12 months among respondents, ahead of sustainable investing and thematic strategies.

Secondary — Research & Consultancy

  • BCG — Global Asset Management Report + GenAI Benchmark, May 2024. Stats cited: 72% expect GenAI to have significant or transformative impact within 3–5 years; 66% have made GenAI a strategic priority; 75% are dedicating capital and people in the short term; only 16% have fully defined a strategy and are implementing it throughout their business.
  • McKinsey & Company — Asset Management AI Economics, July 2025. Stats cited: pre-tax operating margins declined approximately 3 points in North America and 5 points in Europe (2019–2023) despite rising technology spend; AI/gen AI/agentic AI impact could be equivalent to 25–40% of an average asset manager's cost base; asset managers allocate 60–80% of technology budgets to run-the-business initiatives.
  • Deloitte Center for Financial Services — Alternative Data Process Research. Cited for: journey from discovery to full integration spans multiple years; fully incorporating alternative data into the investment decision process may span 2–3 years; onboarding involves thorough due diligence, contracts, price negotiations, data storage, and access rights.
  • Coalition Greenwich — Alternative Data Adoption Study (press release summary). Stats cited: three-quarters of buy-side firms use non-traditional data sources; nearly two-thirds expect to increase alternative data spending in the next year; about a quarter of those plan to increase by more than 10%.
  • Burton-Taylor Consulting (summarized by Finextra, 2025). Stats cited: global spending on financial market data and news reached $44.3bn in 2024, rising 6.4%. Note: underlying Burton-Taylor report is paywalled; figures cited from Finextra attribution.
  • WatersTechnology / TRG Screen — Market Data Management Benchmark Survey. Stats cited: 70% of buy-side firms are looking to outsource at least one aspect of market data management; 46% use specialist third-party tools. Note: survey was sponsored and commissioned; presented as survey evidence, not an independent census.
  • Neudata — The State of the Alternative Data Market in 2026, February 2026. Author: Daryl Smith, Head of Research. Stats cited: $2.8bn estimated alternative data market size in 2025 (17% YoY growth); 66% of respondents use AI/LLMs for internal efficiency; 36% use AI-processed data for investment strategies; ~19 average datasets subscribed to per buyer per year; average fund spends ~$1.4m/yr on alt data.

Primary — Regulatory & Standards Bodies

  • SEC Division of Examinations — Risk Alert: Investment Adviser Use of Alternative Data, April 26, 2022. Directly cited: exam staff observed advisers using alternative data without reasonably designed written policies and procedures for MNPI risk; examples include ad hoc and inconsistent diligence and failure to memorialize diligence processes. Anchors governance and provenance claims throughout.
  • SEC Enforcement — App Annie Inc. and Bertrand Schmitt: Securities Fraud Charges, September 14, 2021. The SEC's first enforcement action against an alternative data provider. Focused on misrepresentations about how data was derived and what controls existed. Cited for: data provenance and controls are regulatory expectations, not optional governance choices.
  • IOSCO — Final Report: The Use of Artificial Intelligence and Machine Learning by Market Intermediaries and Asset Managers (FR06/2021). Cited for: key AI/ML risk categories including governance/oversight, algorithm development/testing/monitoring, data quality and bias, transparency/explainability, outsourcing, and ethical concerns; guidance on designated senior management accountability.
  • European Parliamentary Research Service (EPRS) — AI Act Implementation Timeline, June 2025. Cited for: EU AI Act entered into force 2024; general date of application 2 August 2026; AI Act expected to be fully effective by 2027.
  • NIST — AI Risk Management Framework 1.0 (AI RMF 1.0). Cited as a voluntary, non-sector-specific, use-case-agnostic resource for designing and deploying AI while managing risks and promoting trustworthy AI. Used as a neutral governance anchor.
  • EDM Council — DCAM v3 (Data Management Capabilities Assessment Model) and CDMC (Cloud Data Management Capabilities). Cited for: DCAM v3 expanded support for AI and cloud with stronger emphasis on governance, privacy, and protection; CDMC defines 14 Key Controls and Automations for protecting sensitive data (including PII and MNPI) in cloud environments.
If you want an analysis calibrated to your firm's specific stack, entitlements, asset class, horizon, and capacity — including a tighter bps-per-dataset estimate based on your actual strategy — Kamba can produce a tailored diagnostic.