Kamba – State of Data in Finance 2025
Kamba | White Paper

State of Data in Finance 2025:
From Raw Feeds to Agentic Workflows

CIOs and Heads of Data are sitting on rising data budgets, complex vendor stacks, and pressure to “do something” with AI. This paper is our view of where the market really is, where it is stuck, and what a modern, AI-native data workflow has to look like if you want Sharpe-relevant impact instead of another sandbox.

Section 1

Executive Overview

The short version: data spend is still growing, AI has gone from experiment to expectation, but workflows are nowhere near where they need to be. The firms that win will treat data and AI as a single operating system, not two separate projects.

Across 2025 surveys and benchmarks, a consistent picture emerges: alternative data budgets are still expanding or holding steady for nearly 90% of buy-side firms, while market-data budgets grow more slowly but remain highly resilient. At the same time, AI usage in internal workflows has roughly doubled year-on-year and generative/agentic AI is now a CIO agenda item, not a side experiment.

The problem is not lack of data or lack of AI. The problem is workflow. Data teams spend their time negotiating price, cleaning feeds, chasing entitlements, and re-implementing the same due diligence and backtests. PMs and quants wait months for datasets that should be in production in weeks. Compliance teams see rising risk with limited transparency into how AI agents are touching data.

This paper summarizes where the market really is in 2025, what leading firms are struggling with, and how a modern, AI-native workflow can change the economics. It is written from the vantage point of Kamba’s work with data-intensive hedge funds, multi-PM platforms, and asset managers.

  • Alt-data budgets: ~90% ↑ or flat. Most firms expect alternative data spend to increase or stay level into 2026.
  • Market-data spend: Stable + low-teens % ↑. Market data remains one of the stickiest, least cuttable budget lines.
  • AI adoption: >90% using GenAI. Most wealth & asset managers now report multiple GenAI use cases in production.

Figures above are directional, synthesized from 2025 industry surveys and reports by Neudata, Coalition Greenwich, BCG, McKinsey, EY, Citi, CFA Institute, Grant Thornton and others.

Section 2

Market Snapshot 2025: Budgets, Stacks, and Structure

The data market in 2025 is bigger, more crowded, and more operationally fragile than it looks from the P&L line “Data & Subscriptions”.

Alternative data. Recent buy-side surveys show:

  • ~89% of firms expect alt-data budgets to increase or stay the same going into 2026.
  • The average respondent reports ~19 alternative datasets in production, with a long tail of firms running 50+.
  • Trials are brutal: over the last two years, most firms subscribed to fewer than 25% of the datasets they trialled.

Translation: the alt-data market is still in expansion mode, but signal-to-noise is low and sourcing teams spend much of their time trialling datasets that never make it to alpha.

Market data. The picture is different:

  • Budgets are more stable, with most firms planning flat or modest single-digit increases rather than cuts.
  • Buyers report subscribing to ~30 market datasets on average, with large players sitting in the triple digits.
  • Market data is increasingly delivered via cloud and managed services, but usage analytics and rationalization are still immature.

Large benchmark studies consistently show overlapping products, opaque entitlements, and limited visibility into who is actually using what. That is margin leakage, not strategy.
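As a rough illustration of what "visibility into who is actually using what" could mean in practice, the sketch below flags feeds whose active-user share falls under a threshold. The feed names, seat counts, and the 25% threshold are all hypothetical; a real review would start from actual entitlement and usage logs.

```python
from dataclasses import dataclass

@dataclass
class Feed:
    name: str
    annual_cost: float    # USD per year (hypothetical)
    entitled_users: int   # seats with access
    active_users: int     # seats with any usage in the review window

def flag_low_usage(feeds, min_active_share=0.25):
    """Return (name, active_share, cost_per_active_user) for under-used feeds."""
    flagged = []
    for f in feeds:
        share = f.active_users / f.entitled_users if f.entitled_users else 0.0
        if share < min_active_share:
            # A feed nobody uses has no meaningful per-user cost
            cost_per_active = (
                f.annual_cost / f.active_users if f.active_users else float("inf")
            )
            flagged.append((f.name, share, cost_per_active))
    # Lowest utilisation first
    return sorted(flagged, key=lambda row: row[1])

feeds = [
    Feed("feed_a", 150_000, 40, 30),
    Feed("feed_b", 150_000, 40, 4),
    Feed("feed_c", 200_000, 10, 0),
]
flagged = flag_low_usage(feeds)
for name, share, cost_per_active in flagged:
    print(f"{name}: {share:.0%} active, cost per active user = {cost_per_active:,.0f}")
```

The threshold is a judgment call: a feed used by few people can still be essential to one desk, so a list like this is a starting point for a conversation, not a cut list.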

Direction of Travel: Alt vs Market Data Budgets
Directional share of buyers expecting spend to rise or stay flat (2025 → 2026).
  • Alternative data: ~89%
  • Market data: ~80–85%

Typical Production Stack (Directionally)
Average number of datasets in production by type.
  • Alternative data: ~19
  • Market data: ~30

Numbers are rounded and synthesized across multiple 2025 benchmarks; they are meant to be directional, not exact.

Section 3

What Buy-Side Firms Are Struggling With in 2025

If you strip away the marketing, the same four failure modes show up in almost every serious survey: economics, workflow, governance, and culture.

1. Economics: “Data drag” is real.

  • Price negotiations remain the top friction in onboarding new datasets. Buyers see list prices drifting up while proving signal is still expensive.
  • Trials often fail because internal teams can’t get their hands on the data quickly enough or can’t tie it cleanly to existing research processes.
  • High renewal rates and sticky contracts mean the default is to keep paying for legacy feeds even when usage decays.

2. Workflow: pipelines are clogged.

  • Typical onboarding times from “interesting dataset” to “live in production” are still 3–6 months in the median case, longer for anything that touches macro, compliance, or legal.
  • Data teams re-implement the same steps (profiling, DQR, backtesting, documentation) on every dataset with limited reuse and automation.
  • PMs and quants often see the data only after the initial window of edge has decayed.
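To make the reuse point concrete, here is a minimal sketch of a profiling step written once and applied to any dataset, rather than rebuilt per trial. It is illustrative only: the function name, field names, and sample rows are hypothetical, and a real DQR would cover far more (point-in-time checks, coverage by entity, revisions).

```python
from datetime import date

def profile_dataset(rows, date_field, value_fields):
    """Basic quality profile: row count, date coverage, duplicates, missing rates."""
    dates = [r[date_field] for r in rows if r.get(date_field) is not None]
    seen, duplicates = set(), 0
    for r in rows:
        key = tuple(sorted(r.items()))  # exact-duplicate detection
        if key in seen:
            duplicates += 1
        seen.add(key)
    missing = {
        f: (sum(1 for r in rows if r.get(f) is None) / len(rows)) if rows else 0.0
        for f in value_fields
    }
    return {
        "rows": len(rows),
        "start": min(dates) if dates else None,
        "end": max(dates) if dates else None,
        "duplicates": duplicates,
        "missing_rate": missing,
    }

sample = [
    {"date": date(2025, 1, 2), "ticker": "AAA", "value": 1.0},
    {"date": date(2025, 1, 3), "ticker": "AAA", "value": None},
    {"date": date(2025, 1, 3), "ticker": "AAA", "value": None},  # exact duplicate
    {"date": date(2025, 1, 6), "ticker": "BBB", "value": 2.5},
]
report = profile_dataset(sample, "date", ["ticker", "value"])
print(report)
```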

3. Governance & compliance.

  • Scraping rules, privacy regimes, Reg S-P, EU AI Act and local data-residency rules are tightening, not loosening.
  • Many firms are now running both data and AI model diligence – provenance, rights, usage logs, hallucination risk – with tools that were built for static data, not AI agents.
  • Trial environments are still too often treated as “less-regulated sandboxes” instead of governed extensions of production.

4. Culture & human capital.

  • Internal resistance and “pilot purgatory” slow AI industrialization – many respondents know their firm “uses AI,” but cannot explain where it actually touches P&L.
  • Data engineering and data science teams are stretched thin and are forced to prioritize one-off asks over system upgrades.

Top Operational Pain Points
Directional share of buyers citing each as a major issue.
  • Price negotiations: ~60%
  • Data quality / lack of signal: ~40–45%
  • Limited internal resources: ~30–35%
  • Vendor customer service: ~30%

These themes are consistent across 2025 surveys: Neudata on alt and market data; Coalition Greenwich and BCG Expand on market-data spend and usage; and various AI and asset-management studies on talent, governance, and change-management challenges.

Section 4

AI & Agentic Workflows: Hype vs Reality

AI is no longer optional – but most firms are still using it as a tactical accelerator, not as the core operating fabric of their data workflow.

Where the industry actually is.

  • Almost all large wealth & asset managers now report multiple GenAI use cases, with internal productivity (summarization, documentation, code generation) as the most common first wave.
  • More than half are experimenting with investment-centric applications – idea generation, signal discovery, portfolio analytics – but only a subset have these wired into production.
  • A growing share are exploring agentic AI (multi-step agents, MCP, tool-calling) to connect AI to real systems, not just chat interfaces.

What’s blocking real impact.

  • Data access: models cannot see internal data, entitlements, or lineage cleanly; every use case becomes a bespoke integration.
  • Governance: risk teams are rightly concerned about how agents source, store, and recombine data; many reports now stress explainability, auditability, and human oversight as core requirements.
  • Economics: AI projects are run as pilots with fuzzy ROI instead of structured programs tied to latency reduction, Sharpe uplift, or unit cost improvement.

How Firms Are Using AI Today (Directionally)
Share of firms reporting each initiative in 2025 surveys.
  • Internal efficiency (summaries, coding, docs): ~65–70%
  • Chatbots / analyst assistants: ~45–50%
  • In-house models for data processing: ~35–40%
  • AI-processed data for investment decisions: ~30–35%

Leading research in 2025 converges on the same point: the next real wave of value in asset management comes from re-architecting investment workflows around AI and data – not just sprinkling AI into existing processes.

Kamba’s bias is clear: AI has to sit inside the workflow – Smart Search, DQRs, backtests, procurement and reporting – not as a separate “AI project”. That’s the bar we design to.

Section 5

Economics of Status Quo vs. Modern AI-Native Stack

Below is a simple, conservative scenario built with ChatGPT: what happens economically if you move from today’s manual data workflow to an AI-native, agentic stack similar to what Kamba delivers.

Scenario set-up (illustrative only).

  • A $5bn multi-PM / multi-strategy fund.
  • Data stack today:
    • 30 market data feeds at ~$150k each ≈ $4.5m / year.
    • 15 alternative datasets at ~$200k each ≈ $3.0m / year.
    • 12 FTE across data engineering, data science, and sourcing at ~$300k fully loaded ≈ $3.6m / year.
  • Total “data & workflow” run-rate ≈ $11.1m / year.
  • Each year the firm onboards a handful of new alt datasets (assume 5) that, if fully exploited, could each support strategies with an approximate Sharpe ratio of 2.0 on a subset of the book – i.e. genuinely high-quality signals, not noise.
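The run-rate in the set-up above can be reproduced in a few lines. This is only a restatement of the scenario's illustrative inputs; the variable names are ours.

```python
# Scenario inputs from the set-up (illustrative, USD per year)
market_feeds, market_cost_each = 30, 150_000
alt_datasets, alt_cost_each = 15, 200_000
team_ftes, fte_cost = 12, 300_000

market_spend = market_feeds * market_cost_each   # market data feeds
alt_spend = alt_datasets * alt_cost_each         # alternative datasets
people_cost = team_ftes * fte_cost               # fully loaded data team

run_rate = market_spend + alt_spend + people_cost
print(f"Market data: ${market_spend / 1e6:.1f}m")
print(f"Alt data:    ${alt_spend / 1e6:.1f}m")
print(f"People:      ${people_cost / 1e6:.1f}m")
print(f"Run-rate:    ${run_rate / 1e6:.1f}m")
```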

Step 1 – Potential alpha per dataset.

Assume each dataset, when properly integrated, can support enough incremental exposure to generate ~20 bps of incremental annual return on 20% of the book.

  • 20% of $5bn = $1bn. 20 bps on $1bn = 0.002 × 1,000,000,000 = $2m per dataset per year of potential incremental P&L, consistent with a ~2.0 Sharpe strategy at that scale.
  • For 5 such datasets, the “full capture” potential is ~$10m / year.
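The same arithmetic as a small function, so the book size, allocation, and basis-point assumptions can be varied. The function name and parameters are ours, not part of any Kamba product.

```python
BPS = 1 / 10_000  # one basis point expressed as a fraction

def incremental_pnl(book_usd, book_share, return_bps):
    """Annual P&L from earning `return_bps` on `book_share` of the book."""
    return book_usd * book_share * return_bps * BPS

per_dataset = incremental_pnl(5_000_000_000, 0.20, 20)
full_capture = 5 * per_dataset  # five datasets per year in the scenario

print(f"Per dataset:  ${per_dataset / 1e6:.1f}m")
print(f"Full capture: ${full_capture / 1e6:.1f}m")
```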

Step 2 – Status quo capture.

With current workflows (3–6 month onboarding, duplicated work, limited re-use), assume:

  • Only ~60% of onboarded datasets make it into durable production.
  • Time-to-production and coordination issues mean they capture only ~60% of first-year edge before it decays or is broadly arbitraged.

Effective capture ≈ 0.6 × 0.6 = 36% of the $10m potential. That is roughly $3.6m of incremental P&L per year.
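A minimal sketch of the capture calculation, treating the two frictions as independent multipliers. That independence is itself a simplifying assumption of the scenario.

```python
def effective_capture(production_rate, edge_capture):
    """Share of potential P&L realised: datasets that reach durable
    production, times the share of first-year edge they still capture."""
    return production_rate * edge_capture

potential = 10_000_000                    # full-capture potential, USD per year
capture = effective_capture(0.60, 0.60)   # status quo assumptions
status_quo_pnl = potential * capture

print(f"Effective capture: {capture:.0%}")
print(f"Status quo P&L:    ${status_quo_pnl / 1e6:.1f}m")
```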

Step 3 – AI-native, Kamba-style workflow.

Now assume the firm deploys an AI-native stack with agentic workflows similar to Kamba Analyst:

  • Smart Search + MCP-enabled agents find and connect the right internal / external data in days, not months.
  • DQR, profiling, documentation and backtests reuse templates and code; agents orchestrate the tedious pieces.
  • PMs see interpretable DQRs and backtests in their native environment (e.g. Symphony) early enough to take real risk.

Conservatively, assume this pushes effective capture to ~80% of potential:

  • 80% of $10m = $8m incremental P&L per year.
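Because the 80% figure is an assumption, it is worth seeing how the result moves if capture lands higher or lower. The sketch below compares a few hypothetical capture rates against the 36% status quo; only the 80% case comes from the scenario itself.

```python
POTENTIAL = 10_000_000       # full-capture potential, USD per year
STATUS_QUO_CAPTURE = 0.36    # from the status quo step

def uplift_vs_status_quo(capture):
    """Extra annual P&L relative to the status quo at a given capture rate."""
    return POTENTIAL * (capture - STATUS_QUO_CAPTURE)

for capture in (0.5, 0.6, 0.7, 0.8, 0.9):
    extra = uplift_vs_status_quo(capture)
    print(f"capture {capture:.0%}: uplift of about ${extra / 1e6:.1f}m")
```

Even at 60% capture, well short of the scenario's assumption, the uplift on these inputs is about $2.4m per year.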

Step 4 – Vendor and capacity efficiency.

  • Better usage analytics and rationalization cut 15% of overlapping market data costs and 10% of low-value alt data:
    • 15% of $4.5m ≈ $0.675m
    • 10% of $3.0m ≈ $0.300m
    • Total ≈ $0.975m, or ~$1.0m in direct vendor savings.
  • Automation of repetitive DQR / trial / backtest tasks frees ~25% of data-team capacity (3 FTE). At $300k per FTE fully loaded, that is ~$0.9m of redeployable capacity.
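The savings and capacity figures can be checked the same way. The percentages are the scenario's assumptions, not measured results.

```python
market_spend = 4_500_000   # USD per year
alt_spend = 3_000_000      # USD per year
team_ftes = 12
fte_cost = 300_000         # fully loaded, USD per year

# Assumed rationalization rates from the scenario
vendor_savings = 0.15 * market_spend + 0.10 * alt_spend

# Assumed share of data-team capacity freed by automation
freed_ftes = 0.25 * team_ftes
capacity_value = freed_ftes * fte_cost

print(f"Vendor savings:        ${vendor_savings / 1e6:.3f}m")
print(f"Freed capacity:        {freed_ftes:.0f} FTE")
print(f"Redeployable capacity: ${capacity_value / 1e6:.1f}m")
```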

Metric | Status quo | AI-native stack | Delta
Incremental P&L from 5 high-Sharpe datasets | $3.6m | $8.0m | +$4.4m
Vendor spend (market + alt data) | $7.5m | $6.5m | +$1.0m (savings)
Redeployable internal capacity | $0 | $0.9m | +$0.9m
Total economic uplift (annual run-rate) | $3.6m | $9.9m | ~$6.3m / year
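As a cross-check, the total can be re-derived by summing the three components from Steps 1 to 4. The sketch uses the unrounded $0.975m vendor saving; with the rounded $1.0m the totals come to about $9.9m and $6.3m.

```python
# (status quo, AI-native) in USD per year, from the steps above
components = {
    "incremental_pnl": (3_600_000, 8_000_000),
    "vendor_savings": (0, 975_000),
    "redeployable_capacity": (0, 900_000),
}

status_quo_total = sum(sq for sq, _ in components.values())
ai_native_total = sum(ai for _, ai in components.values())
uplift = ai_native_total - status_quo_total

print(f"Status quo: ${status_quo_total / 1e6:.3f}m")
print(f"AI-native:  ${ai_native_total / 1e6:.3f}m")
print(f"Uplift:     ${uplift / 1e6:.3f}m")
```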

All figures above are ChatGPT-calculated, illustrative numbers using simple, transparent assumptions for a representative firm. They are not forecasts, not guarantees, and not investment advice. The point is directional: in a world where you are onboarding a handful of genuinely high-Sharpe datasets, the workflow – not the data price – is now the main driver of economics.

Section 6

What “Good” Looks Like in 2026 – and How Kamba Fits

If you are a CIO or Head of Data, the bar is moving. “We use AI” will not cut it. The standard is: can your data workflow put new, vetted, governed signals in front of PMs fast enough to matter?

A modern target state.

  • Unified smart search across internal, market, and alternative data – not three different interfaces.
  • Agentic workflow from question → Smart Search → DQR → backtest → procurement → reporting, with human sign-off at the right points.
  • Embedded compliance: entitlements, provenance, and usage logs baked into the agent stack, not bolted on after.
  • Vendor rationalization informed by usage and impact, not just renewal dates.
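One way to picture the target state is a staged pipeline where entitlement checks and human sign-off are part of the control flow and every step is logged. The sketch below is purely illustrative: the stage names follow the list above, but the classes, handlers, and the choice of procurement as the sign-off point are our assumptions, not a description of Kamba's implementation.

```python
from dataclasses import dataclass, field

STAGES = ["smart_search", "dqr", "backtest", "procurement", "reporting"]
SIGN_OFF_BEFORE = {"procurement"}  # assumption: a human approves before spend

@dataclass
class WorkflowRun:
    question: str
    user: str
    approvals: set = field(default_factory=set)
    log: list = field(default_factory=list)  # usage log: (stage, outcome)

def run_workflow(run, handlers, entitled):
    """Run stages in order; return the stage that blocked, or None if complete."""
    for stage in STAGES:
        if stage not in entitled.get(run.user, set()):
            run.log.append((stage, "blocked: not entitled"))
            return stage
        if stage in SIGN_OFF_BEFORE and stage not in run.approvals:
            run.log.append((stage, "waiting: human sign-off"))
            return stage
        handlers[stage](run)  # the actual work for this stage
        run.log.append((stage, "done"))
    return None

# Placeholder handlers; real ones would call search, DQR, and backtest tooling
handlers = {stage: (lambda run: None) for stage in STAGES}
entitled = {"analyst_1": set(STAGES)}

pending = WorkflowRun("Does dataset X add signal?", "analyst_1")
stopped_at = run_workflow(pending, handlers, entitled)

approved = WorkflowRun("Does dataset X add signal?", "analyst_1",
                       approvals={"procurement"})
finished = run_workflow(approved, handlers, entitled)
print(stopped_at, finished, pending.log[-1])
```

The design point is that the log and the gates live in the same loop as the work, so there is no path through the workflow that skips them.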

Where Kamba is opinionated.

  • AI is not a separate product – it is the operating system of the data workflow.
  • The unit of value is not “a dataset” but “a traceable research workflow” that PMs can trust and reuse.
  • Symphony, MCP and secure private-cloud / on-prem deployments are the delivery backbone for serious institutions.

Kamba AI Data Analyst
An AI-native, agentic system for data-intensive financial institutions.
  • Smart Search across internal databases, documents, emails, and external data sources.
  • Automated DQR & backtests with human-readable reasoning traces.
  • Procurement & subscription workflow integrated with data vendors and internal governance.
  • Symphony-native UX plus private-cloud / on-prem options with no training on client data.
Agentic AI · Financial-grade · Governance-first

What to expect in 90–180 days
  • Measurable reduction in time-to-signal for new datasets.
  • Clear view of “dead spend” and overlapping feeds.
  • A governed AI analyst your teams can actually use in day-to-day work.

The details will differ by firm, but the direction is the same: less manual glue work, more focus on strategy and risk-taking.

Section 7

Sources & Further Reading

This white paper combines Kamba’s own client work with leading 2025 research on data, AI, and asset-management economics. A non-exhaustive list of sources:

  • Neudata – The Future of Alternative and Market Data 2025 (industry survey on budgets, trials, AI use cases).
  • Eagle Alpha – Alternative Data Report 2025 (alt-data sourcing, vendor trends, macro gaps).
  • Crisil Coalition Greenwich / SIX – Market Data in the Age of AI and related 2025 Market Data Study (market-data spend, cloud delivery, challenges).
  • BCG Expand – Market Data Insights 2025 (benchmarking market-data strategies across 100+ institutions).
  • BCG – From Recovery to Reinvention: Global Asset Management 2025 (margin pressure, data & AI as drivers of operating leverage).
  • McKinsey – How AI Could Reshape the Economics of the Asset Management Industry (AI value pools across research, risk, distribution).
  • Citi Global Data Insights – AI in Investment Management: Beyond Efficiency Gains (transition from operational to investment-centric AI; agentic AI).
  • EY – GenAI in Wealth & Asset Management Survey 2025 (GenAI & agentic AI adoption across 100 WAM firms).
  • Grant Thornton – AI is Transforming Asset Management (global 500-firm survey on AI usage and barriers).
  • CFA Institute – AI in Asset Management 2025 (governance, transparency, fairness and professional standards).
  • Deloitte – 2026 Banking and Capital Markets Outlook (data-center and AI investment trends, macro context).
  • PwC – Next in Banking and Capital Markets 2025 (GenAI priorities and regulatory expectations).
  • Datos Insights – Top Trends in Capital Markets 2025 (platformization, AI, and workflow modernization).

For CIOs and Heads of Data who want to go deeper, this report can be shared as a PDF and as an HTML page on kambagroup.com. If you would like a version customized to your firm’s current stack and constraints, Kamba can produce a tailored diagnostic.
