A legal minefield: Open source licensing for AI models
AI models differ from traditional software. Learn how new licensing frameworks aim to standardize what "open" means in AI, including the Linux Foundation's OpenMDW 1.1.
With an open source license, users can see the actual application code and can build and run applications freely. AI throws a wrench in this paradigm.
Unlike a traditional software application, an AI model has its own distinct components and is built in a fundamentally different way. Yet despite those differences, developers have been attempting to apply the same open source licenses to AI models. This approach is less than ideal.
To make matters even more confusing, many people use the phrase "open weight model" to refer to any publicly released or self-hostable AI model, regardless of license.
"That combination of lack of familiarity, an emphasis on restriction and unclear legal drafting introduces a degree of legal risk that does not exist when a model is released under a real open source license," Richard Fontana, principal commercial counsel at Red Hat, said.
It's a challenge that many have tried to solve in recent years. The latest attempt debuted on May 28, 2026, when the Linux Foundation released the OpenMDW 1.1 license framework. On June 2, the G7 Digital and Technology Ministers published a framework calling for shared language around AI openness and explicitly rejecting the binary open/closed classification.
Why AI model licensing is not like software licensing
The problem starts with a fundamental mismatch. Open source licensing was built for software. AI models are not software in the traditional sense. Applying the same legal frameworks to them leaves significant gaps.
Software licensing works because copyright is well understood. When a developer publishes code under Apache 2.0 or MIT, those licenses grant rights tied to a copyrightable work. Enterprise legal teams know how to evaluate them.
"For the more permissive open source licenses at least, it is fairly straightforward to map those license terms to the use of models," Fontana said.
AI models are a different story. A trained model contains weights and parameters, training data, deployment code and documentation, each with a different legal character. In fact, the weights might not be copyrightable at all.
"If you look at a weights and parameters file, it just looks like a stream of data with a bunch of commas in it," Michael Dolan, SVP of legal and strategic programs at the Linux Foundation, said. "It's basically data at that point. There's not copyrightable works in there."
Apply a copyright-based license to that, and recipients are left uncertain about what they can actually do with the model.
"It's not clear to people what you can do with it, if you have the rights that you need to do the things you want to do," Dolan said.
The open source AI licensing landscape
Several efforts have attempted to address the software-model licensing gap.
Apache 2.0 applied to models
The Apache 2.0 license is one of the most widely used open source licenses for software. It is often the default choice for openly released AI models.
"Apache 2.0 is one good license for AI model weights and AI-related software because it's an open source license that is widely used and widely understood in both the software and AI model settings, but it is not the only one," Fontana said.
The larger problem is publishers claiming the Apache 2.0 label while keeping weights, training data or training code private, a practice the G7 framework calls "open washing."
Bespoke pseudo-open licenses
Meta's Llama license restricts commercial use above certain user thresholds, prohibits using Llama to train competing models and includes geographic and use-case restrictions. Early DeepSeek releases used a custom license with use-based restrictions layered on top of permissive terms. More recent DeepSeek models have moved to the MIT license.
Pseudo-open licenses create direct compliance risk for enterprise legal teams. "The licenses are generally bespoke, unfamiliar licenses, and they feature an often elaborate set of use restrictions that may be well intended in some cases but are difficult to interpret," Fontana said.
The OSI open source AI definition (OSAID)
Finalized in late 2024, the OSAID holds that a model cannot be called open source unless enough information is available to understand how it was built. The standard initially pushed for training data availability, but the final version settled for requiring detailed documentation about the training data where the data itself cannot be shared.
"The data that people are using to build these models are oftentimes data they can't share, so you're just basically not even allowing open distribution if you adhere to that strict of a definition," Dolan said.
Fontana observes a consensus within the open source community about what the minimum bar should be: weights must be released under a genuine open source license. Beyond that, "there continue to be strong arguments that training data is the AI analogue of source code," he said.
The G7 four-tier typology
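Published in June 2026, the G7 framework replaces the binary open/closed distinction with a spectrum of four tiers: Open Source AI with Open Data, Open Source AI, Open Weights AI and Weights Available AI.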
Stephen O'Grady of RedMonk surveyed 68 models in May 2026 and found that, of 40 non-closed models, none qualified as Open Source AI or Open Source AI with Open Data. Half fell into Open Weights AI and half into Weights Available AI. The G7 framework effectively proposes "open weights" as the practical term of art for enterprise AI work.
What makes OpenMDW different
Rather than defining what must be shared to qualify as "open," OpenMDW defines what rights are conveyed for whatever the publisher chooses to share. OpenMDW, which stands for Open Model, Data and Weights, was developed by the Linux Foundation together with Amazon, Meta, IBM, Microsoft and others. It grew out of work on the Model Openness Framework, which catalogued the different types of artifacts in a model package and mapped the legal frameworks that apply to each. The goal was one license file that covers everything a publisher puts out.
It covers rights that Apache 2.0 does not
OpenMDW grants copyright, patent, database and trade secret rights in a single license file, covering the full range of IP rights that might apply to a model package. A publisher does not need separate licenses for software components, data and documentation.
"It was meant to be all-encompassing of whatever you publish under it," Dolan said.
It is permissive, not copyleft
There is no requirement to publish modifications or contribute changes back. The license travels with distributed materials but imposes no obligations beyond attribution. Trademark rights are excluded.
Nvidia's adoption is the strongest signal yet
Nvidia previously used custom license language for its model releases. Moving to OpenMDW puts the company on a standard framework.
"I think they want to get out to a structure where you create a model, you put an open model license on it, it's a simple one that everybody can understand, so that they don't have to respond to all the inquiries from everybody using it," Dolan said.
One provision might prove controversial
OpenMDW terminates a recipient's rights if they file or participate in a lawsuit asserting that the model materials infringe any patent or copyright. Counterparts exist in some open source licenses, but those typically cover only patent assertions.
"Some community members may see inclusion of copyright here as going too far, given the concerns within the community over what some see as misappropriation of open source software used as training data," Fontana said. "While OpenMDW is attractive as a sort of all-purpose license for model materials, I don't see a compelling reason to choose it over a standard open source license such as Apache-2.0. At this time, I am not aware of any Red Hat work that intends to use OpenMDW."
Comparing the approaches
The table below maps each major licensing approach against the key criteria IT leaders and developers need to evaluate.
What IT leaders should do now
The open source AI landscape can be confusing, but it is manageable. Consider the following steps:
Audit your model inventory. Catalog the open-weight models in your environment and record the license on each. Separate recognized licenses from bespoke terms. Flag anything the legal team has not reviewed.
Apply the G7 four-tier classification. Map each model against the typology. A model described as "open source" that is actually Weights Available AI might carry use restrictions that conflict with your deployment plans.
Know what "open source" means. Red Hat's position is a reasonable baseline for evaluating any model claim. "Don't call something 'open source AI' unless the model weights are released under an open source license and any software itself qualifies as open source based on the license and availability of source code," Fontana said.
Check the terms before you commit. Require publishers to specify which components are included and which license governs each. Any bespoke license requires legal review before deployment.
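For teams that keep their model inventory in code, the audit and per-component checks above can be roughed out in a few lines of Python. This is a minimal sketch, not a compliance tool: the ModelRecord structure, the audit function, the model names and the contents of RECOGNIZED_LICENSES are all hypothetical, and the list of recognized licenses should come from your own legal team.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: licenses your legal team already treats as recognized.
# The two SPDX identifiers here are illustrative, not an endorsed list.
RECOGNIZED_LICENSES = {"Apache-2.0", "MIT"}


@dataclass
class ModelRecord:
    """One entry in a hypothetical model inventory."""
    name: str
    # Maps each published component (weights, code, docs, ...) to its license.
    # None means the publisher did not state a license for that component.
    component_licenses: dict[str, Optional[str]] = field(default_factory=dict)
    legal_reviewed: bool = False


def audit(record: ModelRecord) -> list[str]:
    """Return flags for one model; an empty list means nothing was flagged."""
    flags = []
    for component, license_id in sorted(record.component_licenses.items()):
        if license_id is None:
            flags.append(f"{component}: no license specified")
        elif license_id not in RECOGNIZED_LICENSES:
            flags.append(f"{component}: bespoke or unrecognized license ({license_id})")
    if not record.legal_reviewed:
        flags.append("not yet reviewed by legal")
    return flags


if __name__ == "__main__":
    # Hypothetical inventory entries, for illustration only.
    inventory = [
        ModelRecord("example-model-a",
                    {"weights": "Apache-2.0", "code": "Apache-2.0"},
                    legal_reviewed=True),
        ModelRecord("example-model-b",
                    {"weights": "Example-Community-License", "code": "MIT", "docs": None}),
    ]
    for record in inventory:
        for flag in audit(record):
            print(f"{record.name}: {flag}")
```

A script like this only sorts models into buckets. It does not interpret license terms, so anything it flags still needs to go to counsel.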
"What do you want people to do with your model?" Dolan said. "Put those two things together and ask what license gives the terms that make it clear."
Sean Michael Kerner is an IT consultant, technology enthusiast and tinkerer. He has pulled Token Ring, configured NetWare and been known to compile his own Linux kernel. He consults with industry and media organizations on technology issues.