Understanding AI Due Diligence in M&A
Acquiring an AI-native company is fundamentally different from traditional software acquisitions. You're not just buying code and customers—you're acquiring a system of models, data pipelines, inference infrastructure, and talent that must integrate seamlessly into your portfolio's operational reality. The difference between a sound acquisition and a value-destroying mistake often comes down to what you assess in the first 60 days.
AI due diligence requires a dual-track evaluation: technical maturity and business viability. Most investors focus on the latter—TAM, unit economics, customer concentration. But the technical layer determines whether the company can actually deliver on its promises at scale. A target company might have impressive ARR and strong customer retention, but if its AI systems are brittle, data-dependent, or engineered for research rather than production, you're acquiring technical debt masquerading as revenue.
This article walks you through a structured approach to evaluating AI-native targets. We'll cover the technical assessment frameworks that separate genuinely production-ready systems from pilot-grade implementations, the data infrastructure questions that predict scaling challenges, the talent and process signals that indicate whether the team can maintain and evolve their AI systems, and the governance and compliance layers that determine integration risk. By the end, you'll have a repeatable framework for assessing AI maturity that goes beyond the standard M&A playbook.
The Production-Ready Test: Distinguishing Pilots from Production Systems
The most common mistake in AI acquisitions is conflating "working in the lab" with "working in production." A model that achieves 92% accuracy on a validation set and a model that maintains 92% accuracy across 10,000 daily inference requests in a live environment are not the same thing. Production AI systems have latency constraints, cost constraints, drift constraints, and failure modes that research-grade systems never encounter.
When you're evaluating a target, the first question is: what fraction of their revenue is actually driven by AI systems running in production? Not pilots. Not POCs. Not "we're planning to deploy this in Q2." Production systems that are currently generating revenue and making decisions in real-world environments.
Ask for a production system audit. Have the target's engineering team walk you through their highest-revenue AI application. Document the following:
Latency and throughput characteristics: How long does a single inference request take? For a customer-facing application, this should be sub-500ms. For back-of-house automation, slower responses might be acceptable, but you need to understand the constraint. Ask about tail latencies (p95, p99)—the average is often misleading. If the average inference time is 200ms but the p99 is 5 seconds, you have a problem. What's the current throughput? If they're running 10,000 requests per day and the system can handle 50,000, that's fine. If they're at 45,000, you're one customer expansion away from a crisis.
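To make the tail-latency point concrete, here's a minimal Python sketch, with entirely hypothetical timings, showing how a healthy-looking average can hide an unacceptable p99:

```python
import math

def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest value covering p% of samples."""
    ranked = sorted(samples)
    idx = min(math.ceil(p / 100 * len(ranked)) - 1, len(ranked) - 1)
    return ranked[max(idx, 0)]

# Hypothetical latencies (ms) for one day of inference traffic.
latencies_ms = [180.0] * 97 + [2400.0, 4800.0, 5200.0]

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(f"mean={mean_ms:.0f}ms  p95={percentile(latencies_ms, 95):.0f}ms  "
      f"p99={percentile(latencies_ms, 99):.0f}ms")
# The ~299ms mean looks fine; the 4800ms p99 is the real story.
```

Ask the target for the raw request timings, not just their dashboard's headline number, and run this kind of check yourself.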
Cost per inference: This is the heartbeat metric for AI economics. Calculate the total cost of ownership for their AI infrastructure (compute, storage, model serving, monitoring) and divide by the number of monthly inferences. A well-optimised system should cost between $0.0001 and $0.001 per inference for most applications. If they're above that, either their inference volume is too low (a scaling problem) or their system is inefficient (an engineering problem). Either way, it's a red flag.
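The calculation itself is trivial; the discipline is in assembling the full cost base. An illustrative sketch, with hypothetical monthly figures you'd replace with the target's actuals:

```python
# Hypothetical monthly infrastructure costs; substitute the target's real numbers.
monthly_costs = {
    "compute": 18_000.0,     # model-serving fleet
    "storage": 2_500.0,      # feature store, model artifacts
    "monitoring": 1_500.0,   # observability tooling
}
monthly_inferences = 30_000_000

cost_per_inference = sum(monthly_costs.values()) / monthly_inferences
print(f"${cost_per_inference:.5f} per inference")
# $22,000 / 30,000,000 ≈ $0.00073: inside the $0.0001–$0.001 band.
```

If any cost line is missing from their own version of this calculation (monitoring is the usual omission), their unit economics are understated.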
Model performance tracking in production: Do they have continuous monitoring of model accuracy, drift, and business outcomes in production? This is non-negotiable. If they can't tell you what their model's accuracy was last month versus this month, they're flying blind. Specifically, ask about:
- Baseline drift detection: Are they tracking whether the input data distribution is changing? This is the leading indicator of model performance degradation.
- Outcome tracking: Are they measuring whether the model's predictions actually correlate with business outcomes? A model might maintain statistical accuracy while the business impact degrades because the environment has shifted.
- Retraining cadence: How often do they retrain? Weekly? Monthly? Annually? The answer tells you how much operational overhead the system requires and how quickly they can respond to performance degradation.
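One common way to operationalise baseline drift detection is the Population Stability Index (PSI). The sketch below uses the conventional rule of thumb that PSI above roughly 0.25 signals a significant shift; the binning scheme and thresholds are illustrative, not a standard the target must follow:

```python
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a baseline and a live sample."""
    lo = min(min(expected), min(actual))
    width = (max(max(expected), max(actual)) - lo) / bins or 1.0

    def frequencies(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        # Floor at a tiny value so empty buckets don't blow up the log.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = frequencies(expected), frequencies(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

baseline = [float(x % 50) for x in range(1000)]         # training-time inputs
shifted  = [float(x % 50) + 15.0 for x in range(1000)]  # live traffic, drifted

print(f"no drift: PSI = {psi(baseline, baseline):.3f}")  # ≈ 0
print(f"drifted:  PSI = {psi(baseline, shifted):.2f}")   # well above 0.25
```

Whether they use PSI, a KS test, or a managed monitoring product matters less than whether some check like this runs automatically on production inputs.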
Companies like Brightlume that specialise in production AI deployments typically implement automated retraining pipelines and continuous evaluation frameworks as standard. This is the benchmark you're looking for—not just that they monitor performance, but that they've automated the response to performance changes.
Failure modes and graceful degradation: What happens when the model fails? Does the system fall back to a heuristic? A previous version? Does it alert an operator? Or does it return a null response and break the customer's workflow? Production systems should never fail silently. Ask for their incident logs from the past 12 months. How many model-related incidents were there? How long did they take to resolve? What was the customer impact?
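A minimal illustration of what graceful degradation looks like in code. The function names and fallback heuristic are hypothetical, but the pattern (try the model, alert loudly, serve a deterministic fallback) is what you want to see in their codebase:

```python
import logging

logger = logging.getLogger("inference")

def heuristic_fallback(request: dict) -> dict:
    """Deterministic rule used when the model is unavailable (hypothetical)."""
    return {"score": 0.5, "source": "heuristic"}

def predict_with_fallback(model_call, request: dict) -> dict:
    """Never fail silently: try the model, log the failure, serve the fallback."""
    try:
        return model_call(request)
    except Exception:
        # Degrade gracefully AND make the failure visible to operators.
        logger.exception("model call failed; serving heuristic fallback")
        return heuristic_fallback(request)

def broken_model(request):  # simulates a model outage
    raise TimeoutError("inference backend unreachable")

result = predict_with_fallback(broken_model, {"customer_id": 42})
print(result)  # {'score': 0.5, 'source': 'heuristic'}: workflow keeps working
```

In the walkthrough, ask to see the production equivalent of the `except` branch. If it swallows the error or returns `None`, the system fails silently.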
This production-ready test is your first filter. If the target can't demonstrate that their core AI systems are genuinely production-grade—with latency guarantees, cost efficiency, performance monitoring, and failure handling—you're not acquiring a proven business model. You're acquiring a team and a dataset with the hope they can eventually build a production system. That's a different investment thesis entirely, and it carries significantly more risk.
Technical Architecture Assessment: The Engineering Foundations
Once you've confirmed that the target has genuine production AI systems, the next layer is understanding the technical architecture. This is where many deals go sideways post-acquisition. A target company might have built their AI system in a way that works perfectly for their current scale and customer base but becomes a bottleneck or liability when integrated into your portfolio.
Request access to their technical architecture documentation. Specifically, you want to understand:
Model selection and architecture decisions: What models are they using? Are they fine-tuned versions of foundation models like Claude Opus or GPT-4, or have they built custom architectures? For most business applications in 2025, leveraging best-in-class foundation models with task-specific fine-tuning is the right approach. Custom architectures built from scratch are a red flag unless they're solving a genuinely novel problem. Ask why they made the choices they did. If the answer is "we built it three years ago before foundation models were good," you've identified a technical debt item.
Model serving infrastructure: How are they serving models? Are they using a managed service like Anthropic's API or OpenAI's API, or are they self-hosting? Self-hosting can be more cost-effective at scale, but it requires significant operational overhead. If they're self-hosting, you need to understand their containerisation strategy (Docker, Kubernetes), their scaling approach (horizontal, vertical, auto-scaling rules), and their disaster recovery plan. Self-hosted systems that aren't properly containerised and orchestrated become integration nightmares post-acquisition.
Data pipeline architecture: This is critical. How does data flow from the customer environment into the model, and how are predictions returned? Is it batch processing, real-time streaming, or a hybrid? Are they using a message queue (Kafka, RabbitMQ) or direct API calls? The architecture here determines whether the system can be integrated into your existing data infrastructure or whether you need to build new connectors. Ask about data validation—do they validate incoming data before feeding it to the model? If not, they're vulnerable to adversarial inputs or data quality issues from customer systems.
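Input validation is easy to verify in a code walkthrough. Here's a sketch of the kind of pre-model validation layer you'd hope to find; the schema fields and ranges are hypothetical stand-ins for the target's actual contract:

```python
def validate_record(record: dict) -> list[str]:
    """Return validation errors; an empty list means the record is safe to score."""
    errors = []
    # Hypothetical schema: adapt fields and ranges to the actual pipeline.
    if not isinstance(record.get("customer_id"), int):
        errors.append("customer_id missing or not an integer")
    age = record.get("age")
    if not isinstance(age, (int, float)) or not (0 <= age <= 120):
        errors.append("age missing or out of range [0, 120]")
    if record.get("channel") not in {"web", "mobile", "api"}:
        errors.append("channel not in allowed set")
    return errors

good = {"customer_id": 1, "age": 42, "channel": "web"}
bad  = {"customer_id": "abc", "age": -5}

print(validate_record(good))  # []
print(validate_record(bad))   # three errors; the record never reaches the model
```

If nothing like this sits between customer data and the model, every upstream data quality issue becomes a model quality issue.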
Feature engineering and preprocessing: This is where most AI projects accumulate technical debt. Ask for their feature engineering code. Is it documented? Is it version-controlled? Is it reproducible? If a junior engineer can't understand and modify the feature engineering pipeline, you have a problem. The worst case is when feature engineering is baked into a notebook that only one person understands. That's a key person risk, and it's a serious one.
Model versioning and experimentation framework: Do they have a systematic way to track model versions, run A/B tests, and roll out new models? Or is it ad-hoc? Production AI systems should have a clear versioning scheme, automated testing for new model versions, and a rollout strategy that allows them to deploy new models without taking down the system. If they don't have this, you're acquiring a system that can't evolve safely.
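One lightweight pattern worth looking for is deterministic canary routing: a fixed fraction of traffic goes to the candidate model, the rest to the stable version, and the same request always lands on the same side. A sketch, with illustrative version names and split:

```python
import hashlib

def route_model_version(request_id: str, canary_fraction: float = 0.05) -> str:
    """Deterministically send a fixed fraction of traffic to the candidate model."""
    # Stable hash so the same request (or user) always hits the same version.
    digest = hashlib.sha256(request_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return "model-v2-canary" if bucket < canary_fraction else "model-v1-stable"

routed = [route_model_version(f"req-{i}") for i in range(10_000)]
canary_share = routed.count("model-v2-canary") / len(routed)
print(f"canary share ≈ {canary_share:.1%}")  # close to the configured 5%
```

Whether they hand-roll this or use a feature-flag product, the things to confirm are determinism, a documented rollback path, and automated comparison of canary versus stable metrics.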
The technical architecture assessment should give you a clear sense of whether the target has built a system that can be maintained and extended by a competent engineering team, or whether it's a bespoke, fragile system that's dependent on specific individuals. The former is an asset. The latter is a liability.
Data Quality and Governance: The Foundation of AI Value
AI systems are only as good as the data they're trained on and the data they operate on. This is where due diligence often becomes uncomfortable, because data quality issues are usually hidden. A company might have beautiful dashboards and impressive accuracy metrics, but if those metrics are based on biased, incomplete, or mislabelled data, the whole foundation is unstable.
Start with data inventory. Ask the target to document every dataset they use for training, validation, testing, and production inference. For each dataset, you need to know:
Data source and provenance: Where does the data come from? Is it from customers? Third-party providers? Synthetic? The source matters because it determines whether the data is representative of the production environment. If they trained on historical data from 2022-2023 but are now operating in 2025, there's a distribution shift risk.
Data volume and composition: How many examples are in each dataset? What's the class distribution? If they're training a classification model on imbalanced data without addressing it, their performance metrics might be misleading. Ask for the confusion matrix, not just the overall accuracy.
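A toy example of why the confusion matrix matters more than headline accuracy on imbalanced data, using fabricated labels:

```python
from collections import Counter

def confusion_counts(y_true: list[int], y_pred: list[int]) -> Counter:
    """Raw confusion matrix as (true label, predicted label) -> count."""
    return Counter(zip(y_true, y_pred))

# Fabricated, imbalanced test set: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 95 + [0, 0, 0, 1, 1]   # the model misses 3 of the 5 positives

cm = confusion_counts(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
positive_recall = cm[(1, 1)] / (cm[(1, 1)] + cm[(1, 0)])

print(f"accuracy = {accuracy:.0%}")                # 97%: looks impressive
print(f"positive recall = {positive_recall:.0%}")  # 40%: the number that matters
```

If the rare class is the one that drives business value (fraud, churn, adverse events), a 97% accuracy claim can coexist with a model that misses most of what it's paid to find.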
Data labelling process: If they're using labelled data, who labelled it? Was there inter-annotator agreement checking? If one person labelled the entire dataset, you have a labelling bias problem. If they used crowdsourcing without quality controls, the labels might be noisy. Ask about the labelling error rate—how often do two independent annotators disagree? If it's above 5% for a straightforward task, the data quality is questionable.
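Raw disagreement rate is the quick check; Cohen's kappa corrects it for chance agreement. Both take a few lines of Python on a sample of doubly-annotated items (the labels below are fabricated for illustration):

```python
from collections import Counter

def disagreement_rate(a: list[str], b: list[str]) -> float:
    """Fraction of items where two independent annotators disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Chance-corrected agreement: 1.0 is perfect, 0.0 is no better than chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum((ca[l] / n) * (cb[l] / n) for l in set(ca) | set(cb))
    return (observed - expected) / (1 - expected)

ann_a = ["spam"] * 90 + ["ham"] * 10   # annotator A's labels (fabricated)
ann_b = ["spam"] * 88 + ["ham"] * 12   # annotator B's labels (fabricated)

print(f"disagreement = {disagreement_rate(ann_a, ann_b):.0%}")  # 2%
print(f"kappa = {cohens_kappa(ann_a, ann_b):.2f}")
```

Ask whether the target has ever run a double-annotation exercise at all; if they can't produce numbers like these, the 5% threshold question above is unanswerable.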
Data retention and privacy compliance: How long do they retain data? Is it compliant with GDPR, CCPA, and relevant Australian privacy legislation like the Privacy Act 1988? If they're processing personal data, have they conducted a Privacy Impact Assessment? This is crucial because post-acquisition, you inherit their privacy compliance obligations. If they've been cutting corners, you're acquiring a compliance risk.
Data access controls: Who can access the data? Is it encrypted at rest and in transit? Are there audit logs? If data access is poorly controlled, you have a security risk and a potential compliance issue.
Beyond the data inventory, you need to assess data governance maturity. Ask about their data quality monitoring. Do they track data quality metrics in production? Are they detecting data drift? Here's a concrete example: if their model was trained on data where the average customer age was 35, but in production the average is now 42, the model's performance might degrade. Do they have monitoring in place to detect this shift? If not, they're vulnerable to gradual performance degradation that goes unnoticed until it impacts revenue.
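Detecting that kind of shift doesn't require sophisticated tooling. A deliberately simplistic sketch using a z-score on the feature mean (synthetic ages, and just one of many reasonable tests):

```python
import math

def mean_shift_zscore(baseline: list[float], live: list[float]) -> float:
    """z-score for the shift of the live mean away from the training baseline."""
    mu = sum(baseline) / len(baseline)
    var = sum((x - mu) ** 2 for x in baseline) / (len(baseline) - 1)
    stderr = math.sqrt(var / len(live))
    return (sum(live) / len(live) - mu) / stderr

# Synthetic data: training-era ages centred on 35, live traffic centred on 42.
training_ages = [30.0, 33.0, 35.0, 35.0, 37.0, 40.0] * 100
live_ages     = [37.0, 40.0, 42.0, 42.0, 44.0, 47.0] * 100

z = mean_shift_zscore(training_ages, live_ages)
print(f"z = {z:.1f}")  # far beyond ±3: flag for investigation and retraining
```

A check this simple, run weekly on each key feature, is enough to catch the 35-to-42 drift described above long before it shows up in revenue.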
For health system acquisitions specifically—a key audience for Brightlume's agentic health solutions—data governance is even more critical. Clinical data has regulatory requirements, consent restrictions, and quality standards that are non-negotiable. If the target is working with health data, you need to verify they're compliant with relevant health privacy legislation, that their data handling processes meet clinical standards, and that they have appropriate governance structures for data use in clinical decision-making contexts.
Data quality issues are often discovered post-acquisition, and they're expensive to fix. By front-loading this assessment, you can identify red flags early and either address them as a condition of the deal or adjust your valuation accordingly.
Talent and Operational Maturity: The Execution Risk
AI systems are built by people, and those people need to be competent, aligned, and willing to stay post-acquisition. This is where many deals stumble. You acquire a company for its AI capabilities, but the AI capabilities walk out the door because the team doesn't want to work for your organisation.
Start with the team audit. Who are the key technical contributors? For each, document:
Specialisation and irreplaceability: What specific areas do they own? If one person owns the entire model training pipeline and they leave, you're in trouble. Ideally, you want overlap and documentation so that no single person is a bottleneck. Ask about knowledge transfer plans. Have they documented their work? Can a new engineer come in and understand it within a week?
Tenure and engagement: How long have they been at the company? Are they founders, early employees, or recent hires? Early employees often have deeper context and stronger alignment with the company's mission. Recent hires might be less committed. Are they engaged? Ask about their growth trajectory. Have they been promoted? Are they learning? If the team is stagnant and disengaged, they might be looking for an exit.
Hiring and onboarding capability: Can this team hire and train new people? Or is all the knowledge in their heads? If they've successfully onboarded and trained junior engineers, that's a good sign. If they haven't, you need to understand why. Is it because they don't have the time? Because they don't know how to teach? Because the work is genuinely difficult to learn? The answer matters.
Process and documentation maturity: Do they have code review processes? Testing standards? Documentation requirements? Or is it chaotic? Mature teams have processes because they've learned through experience that processes prevent disasters. Immature teams often skip processes to move faster, which works until it doesn't. Ask for examples of how they handle code reviews, testing, and deployment. If the answer is "we don't really do that," you're acquiring operational risk.
Beyond the individual contributors, assess the leadership team. Do they have experienced engineering managers? Product managers? Operations people? Or is it just a bunch of engineers? Experienced leadership is crucial for scaling and integrating into your organisation. If the leadership team is weak, you'll need to bring in your own people, which creates cultural friction and often leads to key departures.
Finally, ask about their hiring plans and growth ambitions. Are they planning to hire? If so, for what roles? Can they recruit in your geography? If the target is in Australia and you're planning to integrate them into a US-based operation, there might be visa and relocation challenges. These aren't deal-breakers, but they're real costs that need to be factored in.
Across AI-focused due diligence frameworks used by venture and M&A teams, talent assessment consistently ranks among the highest-impact areas for predicting post-acquisition success. Teams with strong processes, clear documentation, and distributed knowledge outperform teams with brilliant individuals and no process.
Model Governance and Compliance: Risk Mitigation
AI systems operate in increasingly regulated environments. Financial services, healthcare, insurance, and hospitality all have regulatory frameworks that impact how AI can be used. If the target is operating in a regulated industry without proper governance, you're acquiring a compliance risk.
Start with governance structure. Do they have an AI governance committee? Who's on it? Is it just engineers, or does it include legal, compliance, and business leadership? Mature organisations have cross-functional governance. Ask what decisions require governance approval. Do they review new models before deployment? Do they have criteria for when a model can be used in production? Do they have a process for decommissioning models that are no longer performing?
Next, assess their compliance posture. What regulations apply to their business? Are they compliant? Ask for evidence—audit reports, compliance certifications, legal reviews. If they're in financial services, have they been through a regulatory audit? If they're in healthcare, have they completed a clinical validation? If they're in insurance, have they assessed fairness and discrimination risks? If the answer to any of these is "no," you need to understand why and factor in the cost of remediation.
On the intellectual property front, the compliance layer is critical. You need to understand whether the target has properly licensed any third-party models, datasets, or tools they're using. If they've fine-tuned a proprietary model without a license, you've inherited a legal problem. If they've trained on data they don't have rights to, same issue.
Ask specifically about model explainability and transparency. Can they explain why the model made a specific decision? This is important for regulatory compliance and for customer trust. If the model is a black box and they can't explain its decisions, it might not be acceptable in regulated industries. Some models are inherently more interpretable than others: linear models and shallow tree-based models can be read more or less directly, while deep neural networks and large ensembles usually require post-hoc interpretability tooling. If they're using opaque models in a regulated context without interpretability tools, that's a governance gap.
Finally, assess their incident response and audit trail capabilities. If something goes wrong with the model—it makes a discriminatory decision, it's compromised by an adversary, it fails catastrophically—can they investigate what happened? Do they have logs? Can they trace the decision back to the training data, the model version, and the input? If not, they can't meet regulatory requirements or defend themselves in a dispute.
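At the record level, a usable audit trail is one structured log line per decision, linking the input, the model version, and the training data snapshot. A sketch of the shape to look for; every identifier below is hypothetical:

```python
import datetime
import json

def audit_record(model_version: str, input_payload: dict, decision: dict,
                 training_data_ref: str) -> str:
    """One JSON line per decision: enough to reconstruct what happened and why."""
    return json.dumps({
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "model_version": model_version,
        "training_data_ref": training_data_ref,  # e.g. dataset snapshot ID
        "input": input_payload,
        "decision": decision,
    })

# Hypothetical decision being logged.
line = audit_record(
    model_version="credit-risk-2025-03-v7",
    input_payload={"applicant_id": 881, "features_hash": "a1b2c3"},
    decision={"approved": False, "score": 0.31},
    training_data_ref="snapshot-2025-02-28",
)
print(line)
```

If the target's logs can't answer "which model version, trained on which data, saw which input" for any given decision, they cannot meet the investigative bar described above.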
Integration Risk Assessment: Technical and Organisational
Even if the target company has solid technical systems, governance, and talent, integration can still go wrong. The target's AI systems need to integrate with your existing infrastructure, data pipelines, and business processes. Misalignment here can destroy value quickly.
Start with infrastructure compatibility. What cloud platforms are they using? AWS, Azure, GCP? What about your existing portfolio companies? If there's a mismatch, you'll need to either migrate (expensive and risky) or run multiple cloud platforms (operationally complex). Ask about their containerisation and orchestration approach. Are they using Kubernetes? Docker? If they're using Kubernetes and your other portfolio companies are using serverless, you have an operational inconsistency.
Next, assess data pipeline integration. How will their data pipelines connect to your existing data infrastructure? If you have a centralised data lake, can they feed data into it? Can they consume data from it? If the answer is "we'll need to build custom connectors," that's fine, but it's a cost and a timeline item. If the answer is "we don't know," that's a red flag.
Assess organisational integration risk. Will the target team report into your existing engineering leadership, or will they remain separate? If they remain separate, you have two engineering cultures that need to coexist. If they integrate, there's a risk of cultural friction and key departures. Neither is inherently wrong, but you need to be intentional about it and plan for it.
Ask about customer integration. How tightly are the target's customers integrated with their AI systems? Are they using APIs? Webhooks? Direct database access? If customers are tightly integrated, you need to ensure that any changes to the AI system don't break customer integrations. This is a backwards compatibility problem that's easy to underestimate.
Finally, assess regulatory and contractual integration. If the target has customers in regulated industries, are there contractual restrictions on how you can use or modify the AI systems? For example, some customers might require that the model not be updated without their explicit consent. If you have 100 customers with different consent requirements, that's operationally complex.
Valuation Implications and Risk Adjustment
Once you've completed the technical due diligence, you need to translate your findings into valuation adjustments. This is where the technical assessment becomes financial.
Start with the production readiness assessment. If the target has genuinely production-ready systems with strong performance metrics, stable revenue, and proven scalability, that's worth a premium. If the target has working pilots but hasn't scaled to production, that's a discount. A rough framework:
- Proven production systems with 3+ years of operating history: No discount
- Production systems with less than 3 years of operating history: 10-20% discount
- Mature pilots generating revenue but not yet production-grade: 30-50% discount
- Early-stage pilots: 50%+ discount or pass
Next, assess technical debt. If the target's systems are well-architected, documented, and maintainable, that's a low-risk asset. If they have significant technical debt—poorly documented code, single points of failure, outdated technology choices—you need to factor in remediation costs. A rough estimate: technical debt remediation costs 1-3 months of engineering time per engineer on the team. If the target has 5 engineers and significant technical debt, budget 5-15 engineer-months of remediation.
Data quality issues have financial implications. If the target's data is incomplete, biased, or mislabelled, that impacts model performance and scalability. If you need to invest in data cleaning, labelling, or retraining, that's a cost. A rough estimate: data quality remediation costs $50-200K depending on the scale of the problem.
Talent retention risk should be reflected in valuation. If the target has key person dependencies, apply a discount. A rough framework: for each key person who represents a single point of failure, apply a 5-10% discount. If the target has three such people, that's a 15-30% discount.
Compliance risk is significant. If the target is operating in a regulated industry without proper governance, you need to budget for compliance remediation. This can range from $100K to $1M+ depending on the industry and the gaps. Factor this into your valuation.
Integration risk should also be reflected. If the target's systems are tightly coupled to their current customer base or infrastructure, and integration will be complex, apply a discount. A rough estimate: for each major integration challenge, apply a 5-10% discount.
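To make the arithmetic explicit, here's an illustrative calculator that applies the rough discount bands above as compounding haircuts. Compounding rather than summing is one defensible convention, not the only one, and every number below is hypothetical:

```python
def risk_adjusted_valuation(base_valuation: float,
                            discounts: dict[str, float]) -> float:
    """Apply each identified risk as a compounding haircut on the valuation."""
    value = base_valuation
    for reason, discount in discounts.items():
        value *= 1 - discount
        print(f"  -{discount:.1%} {reason}: ${value:,.0f}")
    return value

# Hypothetical target: $50M headline valuation, three documented risks.
adjusted = risk_adjusted_valuation(
    50_000_000,
    {
        "production systems under 3 years old": 0.15,
        "two key people who are single points of failure": 0.15,
        "one major integration challenge": 0.075,
    },
)
print(f"risk-adjusted valuation: ${adjusted:,.0f}")  # $33,415,625
```

The point isn't the precise output; it's that each discount is tied to a documented, remediable finding, which is what makes the adjusted number defensible in negotiation.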
The key is to be systematic and evidence-based. Don't apply discounts based on gut feel. Document the specific issues you've identified, estimate the cost to remediate them, and adjust your valuation accordingly. This gives you a defensible valuation and provides a roadmap for post-acquisition value creation.
Red Flags and Deal-Breakers
Some issues are so significant that they should be deal-breakers or require fundamental restructuring of the deal.
Single model dependency: If the target's entire revenue is dependent on one model, and that model's performance is degrading, you're acquiring a declining asset. Ask about model performance trends over the past 12 months. If accuracy is declining, retraining is becoming more frequent, or latency is increasing, you have a problem.
No production monitoring: If the target can't demonstrate that they're monitoring model performance in production, they're flying blind. This is a fundamental operational failure. Don't acquire this without a significant discount and a plan to implement monitoring immediately post-acquisition.
Unresolved compliance issues: If the target is operating in a regulated industry and hasn't addressed compliance requirements, that's a serious problem. Regulators take this seriously, and you could inherit fines, restrictions, or requirements to cease operations. Get legal advice before proceeding.
Undocumented systems: If the target's AI systems are poorly documented and dependent on specific individuals, you have a key person risk. If those key people are unwilling to commit to staying post-acquisition, this is a deal-breaker.
Data quality issues that can't be fixed: If the target's training data is fundamentally biased or unrepresentative, and the only solution is to collect new data, that's a major undertaking. Understand the scope and cost before proceeding.
Misaligned incentives: If the target's team is incentivised to hit short-term metrics at the expense of long-term system health, that's a red flag. For example, if they're optimising for model accuracy without considering latency or cost, they might have built an unscalable system.
Practical Due Diligence Framework: A Checklist
To operationalise this assessment, use the following checklist. For each item, assign a risk rating (low, medium, high) and a priority for remediation.
Production Readiness
- [ ] Percentage of revenue from production AI systems (target: >80%)
- [ ] Latency metrics for production systems (target: <500ms for customer-facing, <5s for back-of-house)
- [ ] Cost per inference (target: <$0.001)
- [ ] Model performance monitoring in production (automated drift detection required)
- [ ] Incident logs and resolution times (target: <1 hour MTTR)
Technical Architecture
- [ ] Model selection and rationale documented
- [ ] Model serving infrastructure documented and tested
- [ ] Data pipeline architecture documented
- [ ] Feature engineering reproducible and version-controlled
- [ ] Model versioning and A/B testing framework in place
Data Quality and Governance
- [ ] Data inventory complete
- [ ] Data labelling process documented and quality-checked
- [ ] Data retention and privacy compliance assessed
- [ ] Data access controls implemented
- [ ] Data quality monitoring in production
Talent and Operations
- [ ] Key technical contributors identified
- [ ] Knowledge documented and transferable
- [ ] Code review and testing processes in place
- [ ] Hiring and onboarding capability demonstrated
- [ ] Leadership team assessed
Governance and Compliance
- [ ] AI governance structure in place
- [ ] Regulatory compliance assessed
- [ ] Model explainability and transparency addressed
- [ ] Incident response and audit trail capabilities
Integration Risk
- [ ] Infrastructure compatibility assessed
- [ ] Data pipeline integration planned
- [ ] Organisational integration approach defined
- [ ] Customer integration dependencies documented
- [ ] Contractual restrictions identified
For each item, if you get a "no" or "unclear," that's a follow-up question. If you get multiple "no"s in a category, that's a risk area that needs attention.
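If you want to roll the checklist up into the low/medium/high ratings suggested earlier, a simple scoring scheme is enough. The answer weights and thresholds here are assumptions, not a standard:

```python
# Illustrative scoring: "yes" = 0 risk, "unclear" = 1, "no" = 2.
RISK_SCORE = {"yes": 0, "unclear": 1, "no": 2}

def category_risk(answers: dict[str, str]) -> str:
    """Roll per-item answers up into a low/medium/high category rating."""
    score = sum(RISK_SCORE[a] for a in answers.values()) / len(answers)
    if score < 0.5:
        return "low"
    return "medium" if score < 1.25 else "high"

# Hypothetical answers for the Production Readiness category.
production_readiness = {
    "revenue share from production AI above 80%": "yes",
    "latency targets met at p99": "unclear",
    "cost per inference under $0.001": "no",
    "automated drift detection in place": "no",
    "MTTR under 1 hour": "no",
}

print(category_risk(production_readiness))  # "high": three hard gaps
```

A scheme like this keeps the assessment comparable across targets and makes it obvious which category your remediation budget should hit first.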
Post-Acquisition Value Creation
Due diligence isn't just about risk assessment. It's also about identifying value creation opportunities. Based on your assessment, you can develop a 100-day integration plan that addresses the gaps you've identified and creates value.
For example, if the target has strong production systems but weak data governance, you can invest in data governance infrastructure post-acquisition. If they have excellent data and models but poor operational maturity, you can bring in operational expertise. If they have great talent but limited infrastructure, you can provide cloud infrastructure and tooling.
This is where Brightlume's 90-day production deployment methodology becomes relevant. Brightlume specialises in taking AI systems from pilot to production at speed, which is exactly what you need post-acquisition. Whether it's optimising model serving infrastructure, implementing governance frameworks, or scaling data pipelines, having a partner with proven production AI expertise accelerates value creation.
The key is to translate your due diligence findings into a concrete 100-day plan with specific milestones, owners, and success metrics. This gives you a roadmap for post-acquisition value creation and helps ensure that the acquisition delivers on its promise.
Conclusion: Making Confident Acquisition Decisions
AI due diligence is complex, but it's not mysterious. By systematically assessing production readiness, technical architecture, data quality, talent, governance, and integration risk, you can make informed acquisition decisions and identify value creation opportunities.
The best acquisitions are ones where the target has already solved the hard problems—they have production systems, strong teams, good data, and clear governance. These are companies you can integrate and scale quickly. The worst acquisitions are ones where the target has impressive pilots but hasn't solved the hard problems, and you're banking on your team to fix everything post-acquisition. That's expensive and risky.
Use the frameworks in this article to assess your targets systematically. Document your findings. Translate them into valuation adjustments. And most importantly, use them to plan your 100-day integration strategy. If you do this well, you'll acquire companies that deliver value quickly and integrate smoothly into your portfolio. If you skip these steps, you'll acquire companies that look good on paper but underperform in reality.
The AI market is moving fast, and acquisitions are an important way to build AI capabilities. But they only work if you're disciplined about what you're actually acquiring and clear-eyed about the work required to make it work. That discipline starts with rigorous due diligence.