AI in Platform Solutions – Already a Real Asset Today
Platform solutions form the technological backbone of many modern companies. They provide a central technical foundation with standardized interfaces, modular components, and unified processes. In doing so, they enable a scalable infrastructure on which various services, products, or teams can collaborate efficiently and consistently. Platform solutions reduce redundant work, foster collaboration, and make it possible to deliver new applications more reliably. At the same time, platform solutions typically store and process large volumes of data, which means that clear responsibilities and the protection of sensitive data must be considered from the outset. Building such platforms is therefore complex – it includes design, development, operations, and continuous improvement. This is where the use of Artificial Intelligence (AI), especially Generative AI (GenAI), comes into play.
In recent years, Generative AI has advanced at a rapid pace. Large Language Models (LLMs) and multimodal systems capable of processing text, code, images, and speech – such as GPT, Claude, Gemini, or Llama – are revolutionizing the way people interact with technology. Unlike traditional AI/ML systems, which rely heavily on specifically trained models for individual tasks, GenAI can dynamically generate context-aware content, make suggestions, or recombine existing information in new ways.
These capabilities open up enormous potential for the development and operation of platform solutions – from requirements analysis and code generation to automated operational monitoring. At the same time, the use of such technologies introduces new challenges and questions regarding security and reliability.
The aim of this article is to provide a practical overview: where and how can AI be used throughout the lifecycle of a platform solution, and which risks need to be considered?
Status Quo: How AI/GenAI Is Being Used
Artificial Intelligence – especially Generative AI – is fundamentally transforming the way companies operate. The McKinsey study The State of AI, based on responses from 1,491 executives across 101 countries, shows that organizations are actively changing how they work in order to generate more value from AI and GenAI. AI is being used more frequently and across a broader range of applications, and it is being integrated ever more deeply into existing systems (Exhibits 8–10) [1].
At the same time, it becomes clear that companies must adapt their workflows and processes to fully harness the potential of these technologies. According to the report AI in the Workplace: A Report for 2025, 92% of companies plan to increase their investments in AI. However, only a small number of executives consider their internal processes sufficiently prepared to unlock that potential [2]. Leaders face the challenge of using AI in meaningful ways, both to support employees and to enhance creativity and productivity. Employees, in turn, are open to using AI and are actively requesting training so that they can integrate the technology effectively into their daily work. Nevertheless, many organizations lack the necessary expertise and structured training programs. Closing this gap through targeted initiatives could unlock significant potential.
Companies already using AI productively report increased profitability (Exhibit 12) and significant cost reductions (Exhibit 13), according to McKinsey [1]. At the same time, automating repetitive tasks frees up capacity for activities that cannot be automated, such as those in the areas of security, compliance, or strategic decision-making. However, whether this potential is realized depends largely on the organizational and technological conditions within a company. The study The Competitive Advantage of Generative AI emphasizes that efficiency and productivity gains can only be achieved if structures and processes are aligned with the requirements of AI technologies, and if technological foundations such as data availability, interfaces, and process integration are in place [3]. Working methods often have to be fundamentally adapted to AI-driven approaches, and new AI-related roles need to be established within companies [1].
Another central issue is the quality of AI-generated outputs. While large language models are already capable of delivering impressive performance in standardized tests – such as in mathematics or language processing – they still lack true reasoning ability and a deeper, nuanced understanding of complex interrelationships [2]. Transparency, control, and a solid understanding of how AI operates and decides are therefore essential – particularly in light of security concerns, regulatory requirements, and compliance with internal company policies.
How AI Can Be Effectively Applied in Platform Development and Operations
The development process of a platform solution typically goes through several phases: conceptualization, architectural design, implementation, testing and quality assurance, infrastructure and deployment, operations, and finally continuous improvement. In each of these phases, Generative AI can be used strategically to boost efficiency and relieve teams of repetitive tasks. For example, architectural proposals can be generated automatically, recurring tasks in infrastructure management or test design can be automated, and creative input can be provided for code generation or documentation. This not only shortens development cycles and reduces costs, but also unlocks new innovation potential, for instance through intelligent analysis of large data volumes or suggestions for alternative solution approaches.
To fully leverage the potential of Generative AI, its integration should be considered early in the platform architecture. Both technical aspects and organizational structures need to be addressed. One of the most challenging areas is data integration: data often exists in varying formats, quality levels, or systems. Before AI can be used productively, questions regarding data quality, availability, and governance must be resolved. This is especially important when dealing with personal or sensitive information, where data protection regulations and compliance requirements must be strictly observed.
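The data questions raised above can be made concrete with a small pre-flight check that runs before any record is handed to an AI component. The following is only a minimal sketch under stated assumptions: the field names in REQUIRED_FIELDS, the report structure, and the choice to mask e-mail addresses with a regular expression are all hypothetical, and a real platform would need far broader handling of personal data than a single pattern.

```python
import re
from dataclasses import dataclass, field

# Assumption: e-mail addresses stand in for "personal data" in this sketch.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical minimal schema every record must satisfy.
REQUIRED_FIELDS = ("id", "source_system", "text")


@dataclass
class ReadinessReport:
    accepted: list = field(default_factory=list)  # cleaned records
    rejected: list = field(default_factory=list)  # (record id, missing fields)


def mask_emails(text: str) -> str:
    """Replace anything that looks like an e-mail address with a placeholder."""
    return EMAIL_PATTERN.sub("[EMAIL]", text)


def check_records(records: list[dict]) -> ReadinessReport:
    """Reject incomplete records and mask e-mail addresses in the rest."""
    report = ReadinessReport()
    for record in records:
        missing = [name for name in REQUIRED_FIELDS
                   if record.get(name) in (None, "")]
        if missing:
            report.rejected.append((record.get("id"), missing))
            continue
        report.accepted.append(dict(record, text=mask_emails(record["text"])))
    return report
```

Keeping the rejected records together with the reason for rejection makes data quality problems visible to the owning team instead of silently dropping them.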
In addition, clear rules must be defined for how to handle AI-generated content. Beyond legal risks such as data protection violations or liability issues, AI also poses technical and ethical risks – such as inaccurate suggestions, biases in training data, or a lack of transparency in decision-making. Responsible use therefore requires clear guidelines, ongoing evaluation, and consistent validation of AI outputs, all of which must be documented and made traceable. Even when content is generated by an AI, ultimate responsibility remains with the company or the individuals in charge.
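One possible way to make AI-generated content documented and traceable is to store a small audit record for every output and to require a named reviewer before the output counts as approved. The sketch below is an illustration only: the AIOutputRecord type, its fields, and the decision to store hashes rather than full texts are assumptions, not a prescribed format.

```python
import hashlib
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AIOutputRecord:
    """Hypothetical audit entry for one piece of AI-generated content."""
    model: str
    prompt_sha256: str
    output_sha256: str
    created_at: str
    reviewer: Optional[str] = None
    approved: bool = False


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def record_output(model: str, prompt: str, output: str) -> AIOutputRecord:
    """Create an unapproved record; hashes avoid storing sensitive text."""
    return AIOutputRecord(
        model=model,
        prompt_sha256=_sha256(prompt),
        output_sha256=_sha256(output),
        created_at=datetime.now(timezone.utc).isoformat(),
    )


def approve(record: AIOutputRecord, reviewer: str) -> AIOutputRecord:
    """Return a copy marked as approved by a named human reviewer."""
    if not reviewer:
        raise ValueError("a named reviewer is required")
    return replace(record, reviewer=reviewer, approved=True)
```

Because the record is immutable, an approval produces a new entry rather than overwriting the original, which keeps the history of who accepted which output intact.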
Platform Solution Development Process
The goal of the conceptual phase in a platform solution is to develop a shared, lean, efficient, and maintainable overall concept. Even at this early stage, AI can assist in preparing information, structuring requirements, and formulating initial user stories. In architectural planning, AI supports tasks such as analysing system landscapes, identifying potential bottlenecks, or generating architecture diagrams. However, the early phases of a project require intensive communication and alignment with all relevant stakeholders. This includes assessing technical feasibility and defining concrete business goals. Architectural decisions, the evaluation of trade-offs (e.g., latency vs. cost vs. availability), integration into existing IT landscapes, security requirements, and legal constraints remain – at least for now – the responsibility of human experts. These decisions demand in-depth knowledge of governance, strategic goals, and company-specific frameworks. While AI can assist, it cannot lead.
During implementation and code generation, GenAI can significantly relieve development teams. AI-powered tools such as GitHub Copilot [5] or Amazon CodeWhisperer [6] can generate code snippets, suggest refactorings, or draft basic documentation. In addition, specialized documentation tools like Mintlify [7] or Conga Composer [8] accelerate the creation and maintenance of documents. GenAI shows especially strong potential in testing: it can identify edge cases, automatically generate unit and integration tests, or help design load tests [5, 9, 10]. For infrastructure and CI/CD (Continuous Integration/Continuous Delivery), AI can generate deployment scripts and optimize pipelines, for example by providing better configuration suggestions or automatically correcting common errors. Developers can move faster and automate repetitive tasks. However, adhering to company-specific standards and implementing complex business logic still require domain expertise and architectural understanding, which an AI tool only has if it has been trained on or supplied with that knowledge. AI can enhance and accelerate quality assurance, but it cannot replace it.
After the initial deployment, the platform enters its operations and continuous improvement phase. AI can provide support here as well, through usage data analysis, performance evaluations, or feature recommendations. However, this phase also highlights how important it is that past architectural and design decisions remain traceable for humans. When a platform has been built with heavy AI assistance, missing context or undocumented automation can later cause problems, for example in debugging, extensibility, or understanding complex dependencies.
Thus, AI can contribute to increased efficiency at nearly every step of the development process – especially in automation, documentation, reusability, and relief from repetitive tasks – and has become indispensable for cost-effective planning and implementation. Still, the role of the developer or platform architect remains essential. While AI often delivers a quick answer, it does not always provide the best solution. Particularly in the design of complex systems, AI lacks business context, an understanding of priorities, and a holistic view of system landscapes. Humans are still needed to integrate system components, enforce compliance and security requirements, and drive innovation. An AI can only perform as well as the data it's based on – data silos, limited system access, or a lack of process integration clearly define its boundaries.
Operating a Platform Solution
AI can be used in platform operations to optimize workflows, predict outages, and enhance security. This approach has increasingly become an integral part of operational IT processes under the term AIOps – Artificial Intelligence for IT Operations. AIOps refers, among other things, to AI-supported pattern recognition and analysis of telemetry data – logs, metrics, traces – with the goal of identifying significant events early, resolving incidents faster, and significantly reducing operational effort through intelligent processes.
In infrastructure projects and solutions, AIOps is particularly beneficial where large volumes of operational data are generated. Three capabilities are especially valuable: predictive analytics, such as forecasting resource needs based on traffic patterns and user behaviour; anomaly detection; and the prioritization and filtering of incidents. Predictive analytics can be used to save costs on hardware resources or to achieve fewer outages and greater reliability. Anomaly detection can identify irregularities in CPU, memory, or network behaviour before they develop into serious issues. At the same time, intelligent alerting and automated incident prioritization reduce the risk of “alert fatigue” among operations teams [13, 14, 15].
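To illustrate the idea behind anomaly detection on a metric such as CPU utilization, the following sketch flags samples that deviate strongly from a rolling window of preceding values. It is a deliberately simple statistical baseline (a rolling z-score), not the method used by any of the products mentioned in this article; the window size and threshold are assumed values that would have to be tuned per metric.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(samples, window=10, threshold=3.0):
    """Return indices of samples far outside the preceding window.

    A sample is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` samples.
    """
    history = deque(maxlen=window)
    anomalies = []
    for index, value in enumerate(samples):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # A perfectly flat window (sigma == 0) is skipped to avoid
            # dividing by zero; this sketch does not flag such cases.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(index)
        history.append(value)
    return anomalies


# Example: CPU utilization in percent with one spike at index 10.
cpu = [40, 42, 41, 39, 40, 43, 41, 40, 42, 41, 95, 41, 40]
print(detect_anomalies(cpu))  # [10]
```

Note that the spike itself enters the window afterwards and temporarily inflates the standard deviation, so closely following anomalies may be missed. Production systems typically account for this as well as for seasonality and trends.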
Another area of application is automated security monitoring: behaviour-based models can detect suspicious activities more quickly than rule-based systems and propose or initiate appropriate countermeasures. This improves resilience and responsiveness, especially in hybrid or multi-cloud infrastructures. As a result, metrics such as mean time to detect (MTTD) and mean time to resolution (MTTR) can be significantly improved [13].
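How MTTD and MTTR are derived from incident data can be shown with a short calculation. The sketch below assumes that both metrics are measured from the start of an incident; definitions vary between organizations, and some measure resolution time from the moment of detection instead. The Incident type and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    started: datetime   # when the fault actually began
    detected: datetime  # when monitoring or a person noticed it
    resolved: datetime  # when service was restored


def _mean(deltas):
    if not deltas:
        raise ValueError("at least one incident is required")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time to detect: average of (detected - started)."""
    return _mean([i.detected - i.started for i in incidents])


def mttr(incidents):
    """Mean time to resolution: average of (resolved - started)."""
    return _mean([i.resolved - i.started for i in incidents])


incidents = [
    Incident(datetime(2025, 1, 1, 10, 0), datetime(2025, 1, 1, 10, 10),
             datetime(2025, 1, 1, 11, 0)),
    Incident(datetime(2025, 1, 2, 9, 0), datetime(2025, 1, 2, 9, 20),
             datetime(2025, 1, 2, 9, 40)),
]
print(mttd(incidents))  # 0:15:00
print(mttr(incidents))  # 0:50:00
```

Tracking both values before and after introducing AI-supported monitoring gives a simple, comparable measure of whether detection and resolution have actually become faster.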
For platform operations, specialized AI systems can already take over a large part of the workload. Established providers such as Datadog [14], AWS [16], and Dynatrace [17] have long integrated AIOps capabilities into their services and offer enhanced anomaly detection and intelligent alerting. With GitHub Advanced Security, GitHub [18] enables automated analysis of vulnerabilities in code and software libraries, supplemented by automated security updates. For incident management, solutions such as PagerDuty [19] or Opsgenie [20] can be used to support escalation and communication during disruptions.
Despite these advantages, AIOps also presents challenges. Data quality is essential: only with well-prepared and comprehensively collected data from the entire operational system can AIOps deliver precise and reliable results. A lack of understanding of the underlying IT topology can also lead to faulty analyses or incorrect correlations. This is where IT teams are still needed, to accurately identify, assign, and interpret the relationships between components such as databases, services, hardware, and networks [14].
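Why topology knowledge matters for correlation can be sketched with a small dependency graph. Given which components depend on which, an alerting component is treated as a probable root cause only if none of the components it depends on, directly or indirectly, is alerting as well. The component names and the depends_on mapping are hypothetical, and the approach assumes the dependency graph has been maintained correctly by the team, which is exactly the human contribution described above.

```python
def transitive_dependencies(component, depends_on):
    """Collect everything `component` depends on, directly or indirectly."""
    seen = set()
    stack = list(depends_on.get(component, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(depends_on.get(current, ()))
    return seen


def probable_root_causes(depends_on, alerting):
    """Alerting components with no alerting component upstream of them."""
    roots = set()
    for component in alerting:
        upstream = transitive_dependencies(component, depends_on) - {component}
        if not (upstream & alerting):
            roots.add(component)
    return roots


# Hypothetical topology: web -> api -> (db, cache)
topology = {
    "web": {"api"},
    "api": {"db", "cache"},
    "db": set(),
    "cache": set(),
}
print(probable_root_causes(topology, {"web", "api", "db"}))  # {'db'}
```

Without the mapping, the three alerts would look like three separate incidents. With it, two of them can be recognized as likely consequences of the database problem.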
A look into the future shows that with the rise of agent-based systems, even deeper automation will become possible. AI agents will be capable of orchestrating complex workflows among themselves, learning from one another, and continuously improving. In the long term, they could take over even more extensive parts of IT operations, including independent planning, coordination, and execution of technical measures [4].
Risks and Challenges
The very high confidence displayed by AI/GenAI models, combined with often insufficient transparency and traceability of their results, poses significant risks. Many models present their predictions with excessive certainty, even when the results are incomplete, flawed, or based on insufficiently validated data. This often leads professionals and decision-makers to accept AI output without critical scrutiny, resulting in false assumptions that can lead to serious misjudgements in critical business processes. A lack of understanding of the underlying logic and the opaque nature of AI decision-making further exacerbates this issue.
Errors in the implementation of AI systems can also create security vulnerabilities, which attackers may exploit to gain access to sensitive data or disrupt the operation of entire platform solutions. Improper integration increases the risk that critical decisions may be made by an AI system without adequate human oversight – potentially causing system outages, data loss, or other serious consequences. Excessive dependence on AI can lead to a neglect of necessary human expertise and supervision, allowing errors to go unnoticed and have large-scale negative impacts.
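One common way to keep critical decisions under human oversight is to let an AI system execute only low-risk actions on its own and to block everything else until a person has approved it. The following sketch shows the principle; the action names, the risk catalogue, and the ProposedAction type are hypothetical and would need to be defined per platform.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical catalogue of actions that must never run unattended.
HIGH_RISK_ACTIONS = frozenset(
    {"delete_volume", "failover_database", "rotate_credentials"}
)


@dataclass(frozen=True)
class ProposedAction:
    name: str
    target: str
    proposed_by: str = "ai-agent"


def requires_approval(action: ProposedAction) -> bool:
    """High-risk actions need a named human approver."""
    return action.name in HIGH_RISK_ACTIONS


def decide(action: ProposedAction, approved_by: Optional[str] = None) -> str:
    """Return "execute" or "blocked" for an action proposed by an AI system."""
    if requires_approval(action) and not approved_by:
        return "blocked"
    return "execute"
```

A call such as decide(ProposedAction("delete_volume", "vol-1")) is blocked until approved_by names a responsible person, whereas a low-risk action like a restart goes through directly. Where the line between the two tiers is drawn is an organizational decision, not a technical one.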
At the same time, the progressive adoption of AI is fundamentally reshaping the role of technical professionals. The focus is shifting from purely technical-operational tasks such as automation, security, infrastructure, and monitoring toward activities like validating AI outputs, structuring processes, and maintaining strategic oversight. This demands the development of new competencies and continuous training to ensure that AI systems remain under human control and that their results can be reliably evaluated. Ultimately, it is essential to implement mechanisms that regularly review AI system outputs to ensure a proper balance between technological innovation and operational security.
Conclusion and Recommendations
Systems and applications that leverage Artificial Intelligence have established themselves as powerful tools for optimizing infrastructure operations. AI-supported analyses help identify performance bottlenecks, predict outages, and make automated scaling decisions. AI tools also offer valuable support in automating infrastructure processes, documentation, and test generation. Especially in the operational phase, specialized AI systems can significantly ease the workload of development and operations teams. However, a deep understanding of system architectures, the ability to balance cost and performance, and the strategic planning of long-term infrastructure decisions still require experienced professionals. These bring contextual awareness, adaptability, and a deep understanding of business requirements – qualities that AI has yet to fully replicate.
AI therefore serves as a powerful assistant that complements human expertise but does not yet replace it. Consequently, AI/GenAI must be consciously and responsibly integrated into the development process as a supporting tool. To fully realize AI’s potential, system design must be aligned early on with the requirements of AI-powered tools, covering access rights, context provision, and data infrastructure. At the same time, teams must be adequately trained: using AI responsibly, understanding its limitations, crafting effective prompts, and professionally evaluating and interpreting AI outputs are becoming key competencies in the age of AI.
Sources
[1] https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
[2] https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/superagency-in-the-workplace-empowering-people-to-unlock-ais-full-potential-at-work
[3] https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/a-generative-ai-reset-rewiring-to-turn-potential-into-value-in-2024
[4] https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-an-ai-agent
[5] https://github.com/features/copilot
[6] https://aws.amazon.com/de/blogs/machine-learning/introducing-amazon-codewhisperer-the-ml-powered-coding-companion/
[7] https://mintlify.com/
[8] https://conga.com/products/conga-composer
[9] https://www.diffblue.com/code-quality/
[10] https://www.testim.io/
[11] https://aws.amazon.com/de/q/
[12] https://aws.amazon.com/de/codeguru/
[13] https://www.ibm.com/de-de/topics/aiops
[14] https://www.datadoghq.com/knowledge-center/aiops/
[15] https://aws.amazon.com/de/guardduty/
[16] https://aws.amazon.com/de/ai/
[17] https://www.dynatrace.com/de/platform/
[18] https://github.com/security/advanced-security
[19] https://www.pagerduty.com/
[20] https://www.atlassian.com/software/opsgenie