Model First, AI Second: The F1RE Approach
⏱️ 13 minute read
Background
At F1RE, we specialize in model-driven software engineering (MDE). For organizations that use it in their daily operations, the benefits are clear. To newcomers, MDE may seem like a niche approach, but it has demonstrated its value through successful implementations across industries, from automotive to aerospace. Consider how the ESA successfully moved from documents to models.
In this article, we share our perspective on why MDE becomes even more crucial in the age of AI - not as a competitor to AI, but as a foundation that can help organizations get the most value from AI while maintaining control and reliability.
The Gen-AI Hype: A View from Both Sides
Here’s the elephant in the room: building specialized MDE tools would be a hard sell today, given the generative AI hype. "Why don’t you just dump the domain knowledge into some documents and ask ChatGPT to create what you need?" would not be a surprising rebuttal. Everyone is rushing to "throw AI at it" for a fast result. At F1RE, we don’t think this is smart: it creates more problems than it solves, and you lose control over what is generated and from which knowledge.
Having worked at the intersection of model-driven engineering and AI across both industry and academia, I’ve seen how these two approaches interact in practice. In industry, I’ve worked with domain-specific languages for safety-critical systems and most recently led projects developing AI-enabled medical chatbots. Meanwhile, in academia, I co-authored a study published at ACL 2024 that examined how these AI models actually understand code, and the findings were eye-opening. Despite their seemingly impressive outputs, these models don’t really grasp the meaningful connections in code - they just recognize patterns between similar-looking pieces. It’s like someone who can complete sentences in a conversation by pattern matching, without actually understanding how the parts of the sentence relate to each other. Even more surprisingly, we found that making these models bigger (even up to billions of parameters) actually made them worse at capturing these important relationships, not better.
I saw these same limitations play out with the medical chatbots. Despite promising initial results, we struggled to make the system’s answers consistently reliable, even with sophisticated retrieval techniques. Our experience matches what we’re seeing across the industry - developers using AI assistants are seeing 40% more bugs, and Gartner’s survey of 5,728 customers revealed that "while customer service leaders are eager to adopt AI, 64% of customers remain concerned about such an adoption."
These concerns extend far beyond individual experiences. In safety-critical systems, even small inconsistencies in AI outputs can have serious implications. The challenge isn’t just about fixing bugs - it’s about guaranteeing reliable behavior in situations where mistakes could be costly (like affecting patient safety). This widespread experience across the industry shows why we need a more disciplined approach to AI integration, especially in domains where reliability matters.
The Challenge with Current AI Approaches
The fundamental issue isn’t just about bugs or customer concerns - it’s about four critical properties that any business-critical system must have: determinism, reliability, verifiability, and explainability. In the following subsections, we’ll examine each of these properties in detail, comparing how traditional systems and generative AI solutions measure up against them. For each property, we’ll provide clear definitions and concrete examples that illustrate why simply 'adding AI' to existing processes often creates more problems than it solves:
1. Determinism
A property that guarantees that a system always produces the same output for the same input.

| Traditional System | Generative AI |
|---|---|
| A banking transaction with specific inputs always follows the same processing path | The same prompt about transaction processing might yield different responses each time |
| Flight control software responds identically to given conditions | Asking about flight parameters could generate varying suggestions |
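To make the contrast concrete, here is a minimal Python sketch. The `sample_llm_response` function is a toy stand-in for a generative model (not a real API), and `process_transaction` is an illustrative banking rule.

```python
import random

def process_transaction(balance: float, amount: float) -> float:
    """Deterministic: the same inputs always follow the same path."""
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount

def sample_llm_response(prompt: str) -> str:
    """Toy stand-in for a generative model: sampling makes output vary."""
    candidates = [
        "Debit the account, then log the transfer.",
        "Log the transfer, then debit the account.",
        "Debit the account; logging is optional.",
    ]
    return random.choice(candidates)

# Determinism holds for the rule, not for the sampler.
assert process_transaction(100.0, 30.0) == process_transaction(100.0, 30.0)
print(sample_llm_response("How should a transfer be processed?"))
print(sample_llm_response("How should a transfer be processed?"))  # may differ
```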
2. Reliability
The ability to consistently perform according to specifications without failure.

In professional cycling team management, where scheduling decisions impact athlete health and team performance:

| Traditional System | Generative AI |
|---|---|
| Never schedules races violating UCI (Union Cycliste Internationale) regulations | Might suggest non-existent races |
| Enforces mandatory recovery periods between races | Could schedule impossible race combinations |
| Respects rider specialties and qualifications | May create resource conflicts |
3. Verifiability
The ability to prove that a system’s behavior matches its specifications.

| Traditional System | Generative AI |
|---|---|
| System behavior can be mathematically proven to meet safety requirements through formal verification methods | Cannot provide mathematical proof that outputs will always conform to specified requirements |
| Automated testing can verify compliance with every single business rule in the specification | Testing can only sample a tiny fraction of possible outputs, so complete verification is impossible |
| Every possible decision scenario can be analyzed to prove the system will always stay within defined limits | No way to verify that all possible situations will be handled according to business rules |
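What makes the left column possible is a bounded state space. As a rough sketch, assume a toy withdrawal-approval rule whose inputs are small enough to enumerate; we can then check every scenario rather than sampling:

```python
from itertools import product

# Hypothetical business rule: a withdrawal is approved only if the account
# is active and the amount is positive and does not exceed the balance.
def approve_withdrawal(active: bool, balance: int, amount: int) -> bool:
    return active and 0 < amount <= balance

# Because the state space is finite, we can check EVERY scenario, not a sample.
for active, balance, amount in product([True, False], range(101), range(101)):
    approved = approve_withdrawal(active, balance, amount)
    # Safety properties: no approved withdrawal ever overdraws the account,
    # and inactive accounts are never approved.
    assert not (approved and amount > balance)
    assert not (approved and not active)

print("All", 2 * 101 * 101, "scenarios verified")
```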
4. Explainability
The capacity to provide clear, consistent reasoning for decisions.

In autonomous vehicle decision-making:

| Traditional System | Generative AI |
|---|---|
| Vehicle slowed because a specific, traceable rule fired, and the same explanation is produced every run | Different explanations each time for the same decision, with no guarantee that any of them reflects the actual computation |
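For illustration, here is a minimal sketch of a rule-based decision that carries its own explanation; the braking-distance formula is a simplified rule of thumb, not a real control law:

```python
from dataclasses import dataclass

@dataclass
class Explanation:
    rule: str     # which rule fired
    inputs: dict  # the sensor values that triggered it

def decide_braking(obstacle_distance_m: float, speed_kmh: float):
    """Rule-based decision that returns the SAME explanation for the same inputs."""
    braking_distance_m = (speed_kmh / 10) ** 2 / 2  # simplified rule of thumb
    if obstacle_distance_m < braking_distance_m * 1.5:
        return "slow_down", Explanation(
            rule="obstacle inside 1.5x braking distance",
            inputs={"obstacle_m": obstacle_distance_m, "speed_kmh": speed_kmh},
        )
    return "maintain_speed", Explanation(rule="clear road", inputs={})

action, why = decide_braking(obstacle_distance_m=40.0, speed_kmh=80.0)
print(action, why)  # identical decision and trace on every run
```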
The Root of the Problem
These four properties - determinism, reliability, verifiability, and explainability - reveal systemic issues with current AI approaches. But why do these problems persist? The core challenge lies in generative AI’s fundamental design goal: to be flexible enough to handle any input and generate plausible-looking outputs across unlimited domains. This very flexibility makes it inherently difficult to constrain these systems within specific business rules or domain requirements. (While there are other applications of AI focused on specific, bounded tasks that have proven more reliable - which we’ll discuss later in this article.)
An interesting perspective on this comes from a Scientific American analysis that reframes how we think about AI’s inconsistencies. As the article points out, what we commonly call AI "hallucinations" are more accurately described as "bullshitting" - the AI isn’t trying and failing to represent reality; it’s simply generating plausible-looking text without any regard for truth. To quote the article verbatim: "we can see … that nothing about the modeling ensures that the outputs accurately depict anything in the world". In philosophical terms, the bullshitter, unlike a liar, doesn’t care about the truth - they just aim to produce convincing output.
Instead of addressing the fundamental issue of domain control, current approaches try to fix these shortcomings through various bolt-on solutions:
| Approach | Limitation |
|---|---|
| Creating more detailed prompts | Still can’t guarantee consistent interpretation; just more verbose instructions that may or may not be followed |
| Fine-tuning models with domain data | May learn patterns but not rules; expensive and time-consuming with no guarantee of consistent behavior |
| Using RAG (Retrieval-Augmented Generation) | Can access correct reference data, but may still combine information incorrectly or make invalid logical leaps |
| Retraining models | Extremely expensive, time-consuming, and still lacks formal guarantees of rule compliance |
Key Insight: These approaches treat symptoms rather than the root cause - they attempt surface-level fixes rather than establishing a robust domain foundation first. They all share the fundamental weakness of trying to teach rules to a system designed for flexibility rather than consistency.
The F1RE vision: Model-Driven Engineering First
Having seen how bolt-on approaches fail to address the fundamental challenges of AI integration, we propose a different path forward: instead of teaching AI rules through prompts or documents, we encode constraints into the system’s structure itself using our three-layer vision. Let’s examine this approach:
Step 1: Model Your Domain (The Domain Layer)
Create a Domain-Specific Language (DSL) that precisely captures what matters in your domain - its concepts, rules, and relationships.
Before looking at examples, let’s understand the three key elements of domain modeling:
- Concepts: The "things" in your domain - like products in a store or patients in a hospital
- Rules: The step-by-step procedures that must be followed - like "check ID before selling alcohol"
- Constraints: The absolute boundaries that can’t be crossed - like "no one under 18 can buy alcohol"
Let’s see some high-level examples of how different industries could structure their domain knowledge using these elements:
| Domain | Key Elements |
|---|---|
| Healthcare | Concepts: Patient, Treatment, Medication |
| Financial Trading | Concepts: Trade, Portfolio, Risk Level |
| Manufacturing | Concepts: Product, Process, Quality Check |
Key Benefit: Invalid operations become impossible by design. Just as you can’t overdraw a savings account with proper banking software, you can’t violate domain rules in a well-modeled system.
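To make this concrete, here is a minimal sketch of such a domain model in Python, reusing the cycling example from the reliability section. The class names and the two-day recovery rule are illustrative assumptions, not actual UCI regulations:

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass(frozen=True)
class Race:
    name: str
    day: date
    specialty: str  # e.g. "sprint", "climb"

@dataclass
class RiderSchedule:
    rider: str
    specialties: frozenset
    races: list = field(default_factory=list)

    MIN_RECOVERY = timedelta(days=2)  # illustrative, not a real regulation

    def enter(self, race: Race) -> None:
        # Constraint: riders only enter races matching their qualifications.
        if race.specialty not in self.specialties:
            raise ValueError(f"{self.rider} is not qualified for {race.name}")
        # Constraint: mandatory recovery period between races.
        for booked in self.races:
            if abs(race.day - booked.day) < self.MIN_RECOVERY:
                raise ValueError(f"{race.name} violates the recovery period")
        self.races.append(race)

schedule = RiderSchedule("Ana", frozenset({"climb"}))
schedule.enter(Race("Mountain Classic", date(2025, 6, 1), "climb"))  # accepted
try:
    schedule.enter(Race("Hill Finale", date(2025, 6, 2), "climb"))
except ValueError as err:
    print(err)  # the invalid schedule cannot be constructed
```

Because `enter` is the only way to add a race, a schedule that violates the recovery rule simply cannot exist in the system.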
Step 2: Make Your Models Executable (The Execution Layer)
Transform models from documentation into active systems that enforce rules automatically.
Think of this step as bringing your domain rules to life. Instead of having rules in manuals or documentation that people need to read and follow, we create systems that automatically enforce these rules. It’s like the difference between:
- Having a sign that says "Do not enter if the room is full" (passive)
- Having an automatic counter that locks the door when capacity is reached (active)
This automatic enforcement means people can’t accidentally break rules - the system simply won’t allow invalid operations. It’s similar to how ATMs won’t let you withdraw more money than you have in your account, no matter what buttons you press.
This transformation from static documentation to active enforcement represents a fundamental shift in how organizations handle domain rules. Here’s how it looks across industries:
| Domain | Transformation |
|---|---|
| Automotive | Traditional: safety requirements in PDF manuals. Model-driven: executable safety rules enforced automatically during design and testing |
| Aviation | Traditional: flight procedures in manuals. Model-driven: procedures validated and enforced by the system itself |
| Healthcare | Traditional: treatment protocols in guidelines. Model-driven: protocols checked automatically at the point of care |
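The door-counter analogy translates almost directly into code. A minimal sketch:

```python
class Room:
    """The capacity rule is not documentation: entering is only possible
    through a method that enforces it."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.occupants = 0

    def enter(self) -> None:
        if self.occupants >= self.capacity:
            raise RuntimeError("door locked: room at capacity")
        self.occupants += 1

    def leave(self) -> None:
        self.occupants = max(0, self.occupants - 1)

room = Room(capacity=2)
room.enter()
room.enter()
try:
    room.enter()
except RuntimeError as err:
    print(err)  # the rule enforces itself; no sign-reading required
```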
Step 3: Integrate AI Strategically (The AI Layer)
Choose specific AI technologies based on task requirements and necessary guarantees.
While generative AI’s flexibility makes it challenging to control, there exists a spectrum of AI technologies that can provide reliable results when properly constrained. A striking example is DeepMind’s AlphaFold, which revolutionized protein structure prediction by focusing on a specific scientific challenge. This specialized AI system achieved in two years what biochemists had struggled with for 50 years, demonstrating how focused AI applications can deliver remarkable value. By combining such focused AI capabilities with model-driven engineering, we can get the best of both worlds: the reliability of deterministic domain models and the predictive power of AI.
While building the AI layer, we carefully add AI capabilities - but only after we have our domain rules firmly in place. Think of it like building a house:
- First, you need a solid foundation (your domain model that defines what’s valid)
- Then you build strong walls (executable rules that enforce constraints)
- Finally, you add specialized AI tools for specific tasks (predictions, pattern recognition, and analysis)
This approach ensures AI operates within your domain’s boundaries. Instead of hoping AI will learn and follow your rules, you’re building a system where AI literally cannot break them. It’s like giving AI a coloring book with clear lines - it can be creative with colors, but can’t color outside the lines you’ve defined.
Different AI approaches offer distinct benefits and guarantees. Understanding these helps us choose the right tool for each task:
Specialized AI Systems
These are AI tools designed for specific, bounded tasks - from simple supervised learning techniques like decision tree learning that follow clear if-then rules (like qualifying customers for loans) to sophisticated deep learning models trained on specific tasks (like AlphaFold's protein structure prediction). Unlike generative AI, which aims to be a jack-of-all-trades, these systems are a good fit for tasks requiring predictable, verifiable results because they provide the following properties (illustrated in the sketch after the examples table below):
- Predictable Behavior: Like a calculator, gives the same output for the same input every time
- Clear Confidence Levels: Tells you exactly how sure it is about predictions (e.g., "85% confident this patient needs urgent care", backed by a quantifiable estimate)
- Traceable Decisions: Can explain exactly why it made each prediction (e.g., "40% of the risk score came from blood pressure, 30% from age…")
- Direct Domain Mapping: Works directly with your domain rules (e.g., if your medical system has 20 vital signs to check, specialized AI models work with exactly those 20 measurements - no more, no less; for more on this, read about extracting features from raw domain data)
Here are some examples of tasks where specialized AI technologies can be useful.
| Domain | Application Examples |
|---|---|
| Medical Diagnosis | Predict patient risks with exact confidence levels and clear evidence of which symptoms led to the prediction |
| Financial Risk | Calculate precise risk scores where you can trace every factor that influenced the score |
| Manufacturing | Predict equipment failures with specific timeframes and clear reasoning about which measurements indicated problems |
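As a minimal sketch of these properties, consider a decision tree built with scikit-learn; the features and training data below are invented for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

features = ["systolic_bp", "age"]
X = [[120, 30], [180, 65], [110, 25], [170, 70], [160, 55], [125, 40]]
y = [0, 1, 0, 1, 1, 0]  # 1 = urgent care needed (toy labels)

model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Predictable behavior: the same input always yields the same output.
assert (model.predict([[165, 60]]) == model.predict([[165, 60]])).all()

# Clear confidence levels: class probabilities for each prediction.
print("P(urgent):", model.predict_proba([[165, 60]])[0][1])

# Traceable decisions: how much each feature contributed overall.
for name, importance in zip(features, model.feature_importances_):
    print(f"{name}: {importance:.0%} of the decision")
```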
Generative AI
For creative tasks where flexibility matters more than absolute precision, GenAI is handy (but needs to be kept on a tight leash) because it has the following key characteristics:
- Creative Flexibility: Can generate new ideas and content, but may give different outputs each time
- Broader Knowledge: Can suggest new approaches based on general knowledge
- Needs Validation: Like a creative assistant, its suggestions must always be checked against domain rules (see the validation sketch after the table below)
- Indirect Domain Mapping: Understands general concepts but needs guidance to follow specific domain rules
Here are some examples of tasks where generative AI technologies can be useful.
| Domain | Application Examples |
|---|---|
| Legal | Draft initial contracts which lawyers then review and correct |
| Software | Suggest code that must be verified against your system’s rules |
| Marketing | Create content drafts that experts review for accuracy and brand alignment |
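Here is a minimal sketch of that leash in code: every draft passes a domain check before anyone sees it. `call_genai` is a hypothetical placeholder for whatever model you use, and the clause checklist is invented for illustration.

```python
REQUIRED_CLAUSES = ["liability", "termination", "governing law"]

def call_genai(prompt: str) -> str:
    """Hypothetical stand-in for a generative model call."""
    return "...draft contract text..."

def validate_draft(draft: str) -> list:
    """Domain rules decide acceptance, not the model."""
    return [c for c in REQUIRED_CLAUSES if c not in draft.lower()]

draft = call_genai("Draft a consulting contract")
missing = validate_draft(draft)
if missing:
    print("Rejected, missing clauses:", missing)  # never reaches review unchecked
else:
    print("Draft accepted for human review")
```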
Key Insight: Specialized AI is like having a highly specialized expert who always follows your rules and can explain their reasoning. Generative AI is like having a creative assistant who brings new ideas but needs supervision to ensure they follow your domain rules correctly. Both are valuable when used appropriately and constrained by your domain model.
How It All Works Together
While we’ve laid out our vision for integrating AI with domain models, it’s important to understand how this could work in practice. Let’s examine a proposed flow that demonstrates how our model-first approach could ensure reliable AI integration through automated domain validation.
In this diagram, rectangles represent system components (User, GenAI, Domain Layer, and Specialized AI), solid arrows show direct commands or requests, and dotted arrows indicate responses or conditional flows that depend on validation results. Every interaction flows through the Domain Layer, ensuring all operations respect our defined rules and constraints.
Let’s explore each pathway with a concrete example:
- Draft Creation with GenAI. Imagine drafting a manufacturing process document:
  - User requests GenAI to create an initial process draft
  - GenAI generates content following provided guidelines
  - Domain Layer validates that all safety steps are included and in correct order
  - User receives a compliant draft ready for review
- Manual Artifact Creation. Consider creating a new trading rule:
  - User directly inputs trading parameters into the system
  - Domain Layer checks against position limits and regulatory requirements
  - User receives immediate feedback if any constraints are violated
  - Valid rules are accepted into the system
- Intelligent Analysis. Take predictive maintenance in manufacturing:
  - User requests analysis of equipment performance data
  - Specialized AI models access historical data within domain constraints
  - Domain Layer ensures predictions respect operating limits
  - User receives validated predictions about potential equipment failures
Each pathway demonstrates how the Domain Layer acts as a guardian, ensuring all operations - whether human-initiated, AI-generated, or AI-analyzed - remain within established business rules and constraints.
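To tie the pathways together, here is a compact sketch of a Domain Layer acting as that guardian for the second and third pathways (the first was sketched in the generative AI section); the position limit and operating maximum are illustrative assumptions:

```python
class DomainLayer:
    POSITION_LIMIT = 1_000_000  # illustrative regulatory constraint
    OPERATING_MAX = 100.0       # illustrative equipment operating limit

    def accept_trading_rule(self, max_position: int) -> str:
        # Pathway 2: manual artifact creation, validated on entry.
        if max_position > self.POSITION_LIMIT:
            return f"rejected: exceeds position limit of {self.POSITION_LIMIT}"
        return "accepted"

    def validated_prediction(self, predicted_load: float) -> float:
        # Pathway 3: specialized AI output clamped to operating limits.
        return min(predicted_load, self.OPERATING_MAX)

domain = DomainLayer()
print(domain.accept_trading_rule(2_000_000))  # immediate feedback on violation
print(domain.accept_trading_rule(500_000))    # valid rule accepted
print(domain.validated_prediction(130.0))     # prediction respects limits
```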
Key Insight: This model-first approach shows how different AI types serve distinct purposes: GenAI as an optional drafting aid and specialized AI as a core prediction engine. Both are constrained by domain models, ensuring system reliability regardless of which components are used.
In summary: build reliable systems first, then innovate with AI.
About Krishna Narasimhan
I have been a Model-Driven Software Engineer at F1RE since September 2024. With a background spanning both industry and academia, I enjoy building robust software engineering tools and domain-specific languages (DSLs). My work has focused on improving software quality and developer productivity through better tooling and automated analysis techniques.
Prior to F1RE, I worked on projects building software systems across different domains, including safety-critical applications. Most recently, I led projects exploring the use of AI systems in safety-critical domains like medicine. In academia, my research concentrated on static analysis, language design, and most recently AI for coding tasks. I’ve particularly enjoyed bridging the gap between academic insights and practical industry applications, having built and deployed DSLs in both settings.
Throughout my career, I’ve maintained a strong focus on systematic and verifiable approaches to software development, emphasizing robustness and reliability.
You can contact me at krishna@f1re.io.