Model First, AI Second: The F1RE Approach

⏱️ 13-minute read

Background

At F1RE, we specialize in model-driven software engineering (MDE). For organizations that use MDE in their daily operations, the benefits are clear. To those new to the field, it may seem like a specialized approach, but MDE has demonstrated its value through successful implementations across industries, from automotive to aerospace. Consider how the European Space Agency (ESA) successfully moved from documents to models.

In this article, we share our perspective on why MDE becomes even more crucial in the age of AI - not as a competitor to AI, but as a foundation that can help organizations get the most value from AI while maintaining control and reliability.

The Gen-AI Hype: A View from Both Sides

Here’s the elephant in the room: building specialized MDE tools is a hard sell today, given the generative AI hype. "Why don’t you just dump the domain knowledge into some documents and ask ChatGPT to create what you need?" would not be a surprising rebuttal. Everyone’s rushing to "throw AI at it" these days for a fast result. At F1RE, we don’t think this is smart: it creates more problems than it solves, and you lose control over what is being generated and from which inputs.

Having worked at the intersection of model-driven engineering and AI across both industry and academia, I’ve seen how these two approaches interact in practice. In industry, I’ve worked with domain-specific languages for safety-critical systems and most recently led projects developing AI-enabled medical chatbots. Meanwhile, in academia, I co-authored a study published at ACL 2024 that examined how these AI models actually understand code - and the findings were eye-opening. Despite their seemingly impressive outputs, these models don’t really grasp the meaningful connections in code - they just recognize patterns between similar-looking pieces. It’s like someone who can complete sentences in a conversation by pattern matching, without actually understanding how the parts of the sentence relate to each other. Even more surprisingly, we found that making these models bigger (even up to billions of parameters) actually made them worse at capturing these important relationships, not better.

I saw the same limitations play out with the medical chatbots. Despite promising initial results, we struggled to make the system’s answers consistently reliable, even with sophisticated retrieval techniques. Our experience matches broader industry findings - developers using AI assistants are seeing 40% more bugs, and Gartner’s survey of 5,728 customers revealed that "while customer service leaders are eager to adopt AI, 64% of customers remain concerned about such an adoption."

These concerns extend far beyond individual experiences. In safety-critical systems, even small inconsistencies in AI outputs can have serious implications. The challenge isn’t just about fixing bugs - it’s about guaranteeing reliable behavior in situations where mistakes could be costly (like affecting patient safety). This widespread experience across the industry shows why we need a more disciplined approach to AI integration, especially in domains where reliability matters.

The Challenge with Current AI Approaches

The fundamental issue isn’t just about bugs or customer concerns - it’s about four critical properties that any business-critical system must have: determinism, reliability, verifiability, and explainability. In the following subsections, we’ll examine each of these properties in detail, comparing how traditional systems and generative AI solutions measure up against them. For each property, we’ll provide clear definitions and concrete examples that illustrate why simply "adding AI" to existing processes often creates more problems than it solves.

1. Determinism

A property that guarantees that a system always produces the same output for the same input.

| Traditional System | Generative AI |
|---|---|
| A banking transaction with specific inputs always follows the same processing path | The same prompt about transaction processing might yield different responses each time |
| Flight control software responds identically to given conditions | Asking about flight parameters could generate varying suggestions |

2. Reliability

The ability to consistently perform according to specifications without failure.

In professional cycling team management, where scheduling decisions impact athlete health and team performance:

| Traditional System | Generative AI |
|---|---|
| Never schedules races violating UCI* regulations | Might suggest non-existent races |
| Enforces mandatory recovery periods between races | Could schedule impossible race combinations |
| Respects rider specialties and qualifications | May create resource conflicts |

* UCI: Union Cycliste Internationale, the governing body of major cycling events

3. Verifiability

The ability to prove that a system’s behavior matches its specifications.

| Traditional System | Generative AI |
|---|---|
| System behavior can be mathematically proven to meet safety requirements through formal verification methods | Cannot provide mathematical proof that outputs will always conform to specified requirements |
| Automated testing can verify compliance with every single business rule in the specification | Testing can only sample a tiny fraction of possible outputs - complete verification is impossible |
| Every possible decision scenario can be analyzed to prove the system will always stay within defined limits | No way to verify that all possible situations will be handled according to business rules |

4. Explainability

The capacity to provide clear, consistent reasoning for decisions.

In autonomous vehicle decision-making:

| Traditional System | Generative AI |
|---|---|
| Vehicle slowed because: pedestrian detected at 15m distance; road surface friction reduced by 40%; safety protocol 7.2 activated | Different explanations each time: "Considering traffic patterns…", "Based on movement in the area…", "Looking at environmental conditions…" |

The Root of the Problem

These four properties - determinism, reliability, verifiability, and explainability - reveal systemic issues with current AI approaches. But why do these problems persist? The core challenge lies in generative AI’s fundamental design goal: to be flexible enough to handle any input and generate plausible-looking outputs across unlimited domains. This very flexibility makes it inherently difficult to constrain these systems within specific business rules or domain requirements. (While there are other applications of AI focused on specific, bounded tasks that have proven more reliable - which we’ll discuss later in this article.)

An interesting perspective on this comes from a Scientific American analysis that reframes how we think about AI’s inconsistencies. As the article points out, what we commonly call AI "hallucinations" are more accurately described as "bullshitting": the AI isn’t trying and failing to represent reality, it’s simply generating plausible-looking text without any regard for truth. To quote the article verbatim: "we can see … that nothing about the modeling ensures that the outputs accurately depict anything in the world". In philosophical terms, the bullshitter, unlike a liar, doesn’t care about the truth - they just aim to produce convincing output.

Instead of addressing the fundamental issue of domain control, current approaches try to fix these shortcomings through various bolt-on solutions:

| Approach | Limitation |
|---|---|
| Creating more detailed prompts | Still can’t guarantee consistent interpretation - just more verbose instructions that may or may not be followed |
| Fine-tuning models with domain data | May learn patterns but not rules; expensive and time-consuming, with no guarantees of consistent behavior |
| Using RAG (Retrieval Augmented Generation) | While it can access correct reference data, it may still combine information incorrectly or make invalid logical leaps |
| Retraining models | Extremely expensive, time-consuming, and still lacks formal guarantees of rule compliance |

Key Insight: These approaches treat symptoms rather than the root cause - they attempt surface-level fixes rather than establishing a robust domain foundation first. They all share the fundamental weakness of trying to teach rules to a system designed for flexibility rather than consistency.

The F1RE vision: Model-Driven Engineering First

Having seen how bolt-on approaches fail to address the fundamental challenges of AI integration, we propose a different path forward: instead of teaching AI rules through prompts or documents, we encode constraints into the system’s structure itself, using our three-layer vision. Let’s examine this approach:

Step 1: Model Your Domain (The Domain layer)

Create a Domain-Specific Language (DSL) that precisely captures what matters in your domain - its concepts, rules, and relationships.

Before looking at examples, let’s understand the three key elements of domain modeling:

  • Concepts: The "things" in your domain - like products in a store or patients in a hospital

  • Rules: The step-by-step procedures that must be followed - like "check ID before selling alcohol"

  • Constraints: The absolute boundaries that can’t be crossed - like "no one under 18 can buy alcohol"

Let’s see some high-level examples of how different industries could structure their domain knowledge using these elements:

| Domain | Concepts | Rules | Constraints |
|---|---|---|---|
| Healthcare | Patient, Treatment, Medication | If patient has diabetes, check blood sugar before treatment | Only certified doctors can prescribe Class A drugs |
| Financial Trading | Trade, Portfolio, Risk Level | Calculate risk exposure before each trade | No single trade can exceed 5% of portfolio value |
| Manufacturing | Product, Process, Quality Check | Follow specified assembly sequence for each product | Machine operating temperature must stay below 85°C |

Key Benefit: Invalid operations become impossible by design. Just as you can’t overdraw a savings account with proper banking software, you can’t violate domain rules in a well-modeled system.
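
To make this concrete, here is a minimal sketch of the idea in Python. It is purely illustrative - the concepts and the 5% limit come from the financial trading row above, but the class names and structure are a hypothetical simplification, not a real DSL:

```python
from dataclasses import dataclass, field

# Hypothetical domain model for the financial trading example above.
# Concepts: Trade, Portfolio. Constraint: no single trade can exceed
# 5% of portfolio value. The constraint lives inside the model itself,
# so callers cannot construct an invalid state.

@dataclass(frozen=True)
class Trade:
    symbol: str
    amount: float  # trade value in the portfolio's currency

@dataclass
class Portfolio:
    total_value: float
    trades: list = field(default_factory=list)

    MAX_TRADE_FRACTION = 0.05  # domain constraint: 5% of portfolio value

    def execute(self, trade: Trade) -> None:
        # The constraint check is part of the operation itself,
        # not a separate manual step someone might forget.
        limit = self.MAX_TRADE_FRACTION * self.total_value
        if trade.amount > limit:
            raise ValueError(
                f"Trade of {trade.amount} exceeds 5% of portfolio value ({limit:.2f})"
            )
        self.trades.append(trade)

portfolio = Portfolio(total_value=1_000_000)
portfolio.execute(Trade("ACME", 40_000))    # accepted: within the 5% limit
# portfolio.execute(Trade("ACME", 60_000))  # raises ValueError by design
```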

Step 2: Make Your Models Executable (The execution layer)

Transform models from documentation into active systems that enforce rules automatically.

Think of this step as bringing your domain rules to life. Instead of having rules in manuals or documentation that people need to read and follow, we create systems that automatically enforce these rules. It’s like the difference between:

  • Having a sign that says "Do not enter if the room is full" (passive)

  • Having an automatic counter that locks the door when capacity is reached (active)

This automatic enforcement means people can’t accidentally break rules - the system simply won’t allow invalid operations. It’s similar to how ATMs won’t let you withdraw more money than you have in your account, no matter what buttons you press.
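
As a minimal sketch of this difference (Python, hypothetical names), here is the room-capacity rule as active enforcement rather than a passive sign - the check runs on every entry attempt, so the rule cannot be accidentally broken:

```python
# Active enforcement of the room-capacity rule from the example above:
# the "sign on the door" becomes a check the system performs itself.

class Room:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.occupants = 0

    def enter(self) -> None:
        # Invalid operations are simply not possible.
        if self.occupants >= self.capacity:
            raise RuntimeError("Room is full: entry denied")
        self.occupants += 1

room = Room(capacity=2)
room.enter()
room.enter()
# room.enter()  # raises RuntimeError - the system, not a sign, enforces the rule
```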

This transformation from static documentation to active enforcement represents a fundamental shift in how organizations handle domain rules. Here’s how it looks across industries:

| Domain | Traditional | Executable |
|---|---|---|
| Automotive | Safety requirements in PDF manuals | Automated validation of every design change against safety rules |
| Aviation | Flight procedures in manuals | System automatically validates flight plans against weather, fuel, and crew constraints |
| Healthcare | Treatment protocols in guidelines | System enforces protocol compliance in real time |

Step 3: Integrate AI Strategically (The AI layer)

Choose specific AI technologies based on task requirements and necessary guarantees.

While generative AI’s flexibility makes it challenging to control, there exists a spectrum of AI technologies that can provide reliable results when properly constrained. A striking example is DeepMind’s AlphaFold, which revolutionized protein structure prediction by focusing on a specific scientific challenge. This specialized AI system achieved in two years what biochemists had struggled with for 50 years, demonstrating how focused AI applications can deliver remarkable value. By combining such focused AI capabilities with model-driven engineering, we can get the best of both worlds: the reliability of deterministic domain models and the predictive power of AI.

While building the AI layer, we carefully add AI capabilities - but only after we have our domain rules firmly in place. Think of it like building a house:

  • First, you need a solid foundation (your domain model that defines what’s valid)

  • Then you build strong walls (executable rules that enforce constraints)

  • Finally, you add specialized AI tools for specific tasks (predictions, pattern recognition, and analysis)

This approach ensures AI operates within your domain’s boundaries. Instead of hoping AI will learn and follow your rules, you’re building a system where AI literally cannot break them. It’s like giving AI a coloring book with clear lines - it can be creative with colors, but can’t color outside the lines you’ve defined.

Different AI approaches offer distinct benefits and guarantees. Understanding these helps us choose the right tool for each task:

Specialized AI Systems

These are AI tools designed for specific, bounded tasks - from simple supervised learning techniques like decision tree learning that follow clear if-then rules (like qualifying customers for loans) to sophisticated deep learning models trained on specific tasks (like AlphaFold’s protein structure prediction). Unlike generative AI, which aims to be a jack-of-all-trades, these systems are a good fit for tasks requiring predictable, verifiable results, because they guarantee the following properties (a minimal sketch in code follows the list):

  • Predictable Behavior: Like a calculator, gives the same output for the same input every time

  • Clear Confidence Levels: Tells you exactly how sure it is about predictions (e.g., "85% confident this patient needs urgent care" with mathematical proof)

  • Traceable Decisions: Can explain exactly why it made each prediction (e.g., "40% of the risk score came from blood pressure, 30% from age…​")

  • Direct Domain Mapping: Works directly with your domain rules (e.g., if your medical system has 20 vital signs to check, specialized AI models work with exactly those 20 measurements - no more, no less). For more on this, read about extracting features from raw domain data.
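
As a rough sketch of these properties, consider a decision tree built with scikit-learn on an entirely made-up toy dataset (the feature names and data are hypothetical, not a real medical model): the same input always yields the same prediction, the model reports per-class probabilities, and its feature importances make decisions traceable.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy, entirely synthetic data: two features per patient
# (systolic blood pressure, age) and a label (1 = needs urgent care).
X = [[120, 35], [180, 70], [130, 45], [170, 65], [115, 30], [160, 60]]
y = [0, 1, 0, 1, 0, 1]
feature_names = ["blood_pressure", "age"]

model = DecisionTreeClassifier(random_state=0)  # fixed seed: same tree every run
model.fit(X, y)

patient = [[165, 62]]
print(model.predict(patient))        # deterministic: same input, same output
print(model.predict_proba(patient))  # explicit confidence per class

# Traceable decisions: how much each domain feature contributed overall.
for name, importance in zip(feature_names, model.feature_importances_):
    print(f"{name}: {importance:.0%} of the decision weight")
```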

Here are some examples of tasks where specialized AI technologies can be useful.

| Domain | Application Example |
|---|---|
| Medical Diagnosis | Predict patient risks with exact confidence levels and clear evidence of which symptoms led to the prediction |
| Financial Risk | Calculate precise risk scores where you can trace every factor that influenced the score |
| Manufacturing | Predict equipment failures with specific timeframes and clear reasoning about which measurements indicated problems |

Generative AI

For creative tasks where flexibility matters more than absolute precision, GenAI is handy (but needs to be kept on a tight leash) because it has the following key characteristics:

  • Creative Flexibility: Can generate new ideas and content, but may give different outputs each time

  • Broader Knowledge: Can suggest new approaches based on general knowledge

  • Needs Validation: Like a creative assistant, its suggestions must always be checked against domain rules

  • Indirect Domain Mapping: Understands general concepts but needs guidance to follow specific domain rules

Here are some examples of tasks where generative AI technologies can be useful.

| Domain | Application Example |
|---|---|
| Legal | Draft initial contracts which lawyers then review and correct |
| Software | Suggest code that must be verified against your system’s rules |
| Marketing | Create content drafts that experts review for accuracy and brand alignment |

Key Insight: Specialized AI is like having a highly specialized expert who always follows your rules and can explain their reasoning. Generative AI is like having a creative assistant who brings new ideas but needs supervision to ensure they follow your domain rules correctly. Both are valuable when used appropriately and constrained by your domain model.

How It All Works Together

While we’ve laid out our vision for integrating AI with domain models, it’s important to understand how this could work in practice. Let’s examine a proposed flow that demonstrates how our model-first approach could ensure reliable AI integration through automated domain validation.

Figure 1. System Flow

In this diagram, rectangles represent system components (User, GenAI, Domain Layer, and Specialized AI), solid arrows show direct commands or requests, and dotted arrows indicate responses or conditional flows that depend on validation results. Every interaction flows through the Domain Layer, ensuring all operations respect our defined rules and constraints.

Let’s explore each pathway with a concrete example:

Draft Creation with GenAI

Imagine drafting a manufacturing process document (a minimal code sketch of this pathway follows the list):

  • User requests GenAI to create an initial process draft

  • GenAI generates content following provided guidelines

  • Domain Layer validates that all safety steps are included and in correct order

  • User receives a compliant draft ready for review
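
A minimal sketch of this pathway, with the GenAI call stubbed out (the function names and safety steps are hypothetical - the point is the validation gate, not any particular model API):

```python
# Domain rule: these safety steps must appear in the draft, in this order.
REQUIRED_SAFETY_STEPS = ["lockout_tagout", "ppe_check", "pressure_release"]

def generate_draft(prompt: str) -> list[str]:
    # Stub standing in for a GenAI call; here it returns a draft
    # with all safety steps present but two of them out of order.
    return ["ppe_check", "lockout_tagout", "assemble", "pressure_release"]

def validate_process(steps: list[str]) -> list[str]:
    """Domain layer: every safety step must be present and in the defined order."""
    errors = []
    positions = {step: i for i, step in enumerate(steps)}
    for required in REQUIRED_SAFETY_STEPS:
        if required not in positions:
            errors.append(f"missing safety step: {required}")
    present = [s for s in REQUIRED_SAFETY_STEPS if s in positions]
    if [positions[s] for s in present] != sorted(positions[s] for s in present):
        errors.append("safety steps are out of the required order")
    return errors

draft = generate_draft("Create an assembly process for valve unit X")
errors = validate_process(draft)
if errors:
    print("Draft rejected by domain layer:", errors)  # never reaches the user
else:
    print("Compliant draft ready for review:", draft)
```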

Manual Artifact Creation

Consider creating a new trading rule (sketched in code after the list):

  • User directly inputs trading parameters into the system

  • Domain Layer checks against position limits and regulatory requirements

  • User receives immediate feedback if any constraints are violated

  • Valid rules are accepted into the system
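
A sketch of the same gate for this pathway (hypothetical parameter names and limits, echoing the 5% constraint from the trading example earlier): the domain layer returns specific feedback for every violated constraint instead of silently accepting the rule.

```python
# Hypothetical regulatory limits the domain layer enforces.
POSITION_LIMIT = 0.05  # cap: 5% of portfolio value per trade
ALLOWED_ASSET_CLASSES = {"equity", "bond", "fx"}

def validate_trading_rule(params: dict) -> list[str]:
    """Return immediate, specific feedback for every violated constraint."""
    feedback = []
    if params.get("max_trade_fraction", 0) > POSITION_LIMIT:
        feedback.append("max_trade_fraction exceeds the 5% position limit")
    if params.get("asset_class") not in ALLOWED_ASSET_CLASSES:
        feedback.append(f"asset_class must be one of {sorted(ALLOWED_ASSET_CLASSES)}")
    return feedback

proposal = {"max_trade_fraction": 0.08, "asset_class": "crypto"}
violations = validate_trading_rule(proposal)
if violations:
    print("Rejected with feedback:", violations)  # immediate, specific feedback
else:
    print("Rule accepted into the system")
```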

Intelligent Analysis

Take predictive maintenance in manufacturing (sketched in code after the list):

  • User requests analysis of equipment performance data

  • Specialized AI models access historical data within domain constraints

  • Domain Layer ensures predictions respect operating limits

  • User receives validated predictions about potential equipment failures
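
A sketch of this pathway (the extrapolation stands in for a real specialized model, and the 85°C limit is borrowed from the manufacturing constraint earlier - all names are hypothetical): the domain layer wraps the model so every prediction is checked against operating limits before it reaches the user.

```python
# Domain constraint from the manufacturing example: operating temperature
# must stay below 85°C. Predictions implying a violation trigger action.
MAX_OPERATING_TEMP_C = 85.0

def predict_peak_temperature(history: list[float]) -> float:
    # Stand-in for a specialized model (e.g., regression on sensor history);
    # here, a naive linear extrapolation from the last two readings.
    return history[-1] + (history[-1] - history[-2])

def validated_prediction(history: list[float]) -> dict:
    """Domain layer wraps the model: every prediction is checked against limits."""
    predicted = predict_peak_temperature(history)
    within_limits = predicted < MAX_OPERATING_TEMP_C
    return {
        "predicted_peak_temp_c": predicted,
        "within_operating_limits": within_limits,
        "action": "no action required" if within_limits else "schedule maintenance",
    }

print(validated_prediction([78.0, 81.5]))  # extrapolates to 85.0 -> maintenance
```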

Each pathway demonstrates how the Domain Layer acts as a guardian, ensuring all operations - whether human-initiated, AI-generated, or AI-analyzed - remain within established business rules and constraints.

Key Insight: This model-first approach shows how different AI types serve distinct purposes: GenAI as an optional drafting aid and specialized AI as a core prediction engine. Both are constrained by domain models, ensuring system reliability regardless of which components are used.

In summary: build reliable systems first, then innovate with AI.

About Krishna Narasimhan

I have been a Model-Driven Software Engineer at F1RE since September 2024. With a background spanning both industry and academia, I enjoy building robust software engineering tools and domain-specific languages (DSLs). My work has focused on improving software quality and developer productivity through better tooling and automated analysis techniques.

Before joining F1RE, I worked on projects building software systems across different domains, including safety-critical applications; most recently I led projects exploring the use of AI systems in safety-critical domains like medicine. In academia, my research concentrated on static analysis, language design, and, most recently, AI for coding tasks. I’ve particularly enjoyed bridging the gap between academic insights and practical industry applications, having built and deployed DSLs in both settings.

Throughout my career, I’ve maintained a strong focus on systematic and verifiable approaches to software development, emphasizing robustness and reliability.

You can contact me at krishna@f1re.io.