Hi everyone,
This issue is to open a discussion on a crucial topic for the future of our project(s):
The use of AI-powered coding assistants (like Anthropic Claude, Google Gemini, GitHub Copilot, etc.) by project contributors.
Here is the problem in a nutshell: code generated by an AI may not be copyrightable by anyone (see "The Core Challenge" below), which leads to license-pollution and copyright-ownership risks for all of our projects (see "How This Impacts Our Projects" below).
The goal here is to gather opinions, have an open discussion, and figure out and agree on how we move forward with all of this as an open-source project - ultimately crystallizing the outcome into an AI policy and more; please see the deliverables below.
Your Input is Crucial
This policy will shape how we work going forward in this new world unfolding. Please share:
- Your experiences using AI coding assistants
- Concerns about the proposed approach
- Suggestions for making the policy both protective and practical
- Examples of good/bad AI usage you've encountered
Let's work together to create a policy that protects our project while remaining welcoming to contributors who use modern tools responsibly!
Ah, and sorry for the "long" issue - I took quite some time and care to collect material and thoughts exhaustively (my understanding / views / opinions) so we have some meat to chew on and discuss ;)
Applicable To
This issue is relevant to all of these projects, all WAMP related:
- WAMP: The Web Application Messaging Protocol (the protocol specification and website)
- txaio: txaio is a helper library for writing code that runs unmodified on both Twisted and asyncio / Trollius.
- Autobahn|Python: WebSocket & WAMP for Python on Twisted and asyncio.
- Autobahn|JS: WAMP for Browsers and NodeJS.
- Autobahn|Java: WebSocket & WAMP in Java for Android and Java 8
- Autobahn|C++: WAMP for C++ in Boost/Asio
- Autobahn|Testsuite: The Autobahn|Testsuite provides a fully automated test suite to verify client and server implementations of The WebSocket Protocol (and WAMP) for specification conformance and implementation robustness.
- Crossbar.io: Crossbar.io is an open source networking platform for distributed and microservice applications. It implements the open Web Application Messaging Protocol (WAMP)
- zLMDB: Object-relational in-memory database layer based on LMDB
- cfxdb: cfxdb is a Crossbar.io Python support package with core database access classes written in native Python.
Rather than filing one issue on each of the above 10 repositories, I've decided it makes more sense for the discussion to happen in one repository only - the one with the most GitHub stars, which is Autobahn|Python. But if and once the discussion concludes, I will file the corresponding issues on the other 9 repos, promise.
Sidenote 1: Collecting all of this, I just realized a) how crazy this whole endeavour (WAMP etc.) has turned out to be, b) how much we have achieved with all of you contributing (OSS, oh yeah!), and c) that I am crazy! Did I mention that already? Well, it's true ;)
Sidenote 2: Personally, I have lately done quite some experimentation with "AI" in various ways and for various uses, and I am quite thrilled and optimistic that AI can indeed help us tame the beast described above! At least for me, for hacking, coding and all that, it is an incredible catalyst / accelerator and time saver - and time is of the essence, always "too little" and all. Which is part of the reason I am filing this issue.
Deliverables
- AI_POLICY.rst: AI policy and guidelines addressed to human contributors/developers/users
- CLAUDE.md: AI policy and guidelines addressed to AI assistants/agents; also covering technical matters (code formatting, GitHub workflow, documentation, test strategy, ...)
- README.rst: a single paragraph ("IMPORTANT") pointing to the above
- GitHub issue, PR, and commit templates covering AI matters
Meta-Goal: Making the Right Thing the Easy Thing
Before diving into the details, let's be clear about our philosophy. We've all seen compliance processes fail because they create overhead without value. Our goal is different:
We want to create a process that:
- ✅ Actually helps developers write better code and documentation
- ✅ Makes collaboration more transparent and effective
- ✅ Integrates seamlessly into existing workflows
- ✅ Creates legal protection as a natural byproduct, not as bureaucratic overhead
We explicitly reject:
- ❌ Compliance theater that wastes developer time
- ❌ Processes that exist only to check boxes
- ❌ Training videos, quizzes, or attestation forms
- ❌ Anything that makes contributing harder without making it better
The principle is simple: If following the process makes developers' work better, they'll actually follow it.
Our Intent: Responsible Innovation
As AI tools become more powerful and integrated into our workflows, it's vital that we proactively establish a clear policy to:
- Protect the legal integrity of our codebase
- Respect our licensing commitments to users and contributors
- Enable responsible use of productivity-enhancing AI tools
- Create transparent documentation of our development practices
- Lead by example in the open source community
This affects all our projects, from the dual-licensed Crossbar.io to the permissively-licensed Autobahn|XYZ family.
The Core Challenge: AI, Authorship, and Copyright
The central issue stems from a fundamental principle in copyright law (e.g., as interpreted by the U.S. Copyright Office):
A work must be created by a human to be copyrightable. An AI cannot be an author and cannot hold copyright.
This has several critical consequences for us:
- The Ownership Gap: Code generated by an AI without significant human creative input or modification is not owned by the user who prompted it. It may fall into the public domain.
- The "Union License" Problem: AI models are trained on vast datasets containing code under various licenses (MIT, GPL, Apache, proprietary, etc.). If AI output is considered a derivative work of its training data, the legal implications are staggering (see the sketch after this list):
- The output could carry obligations from ALL licenses in the training set (L₁ ∪ L₂ ∪ ... ∪ Lₙ)
- If any two licenses are incompatible (e.g., GPL-2.0 vs GPL-3.0), the output may be legally unusable
- Even if all licenses are compatible, determining and complying with this "union license" is practically impossible
- The "Derivative Work" Interpretation Chaos: The term "derivative work" itself is a legal minefield:
- The U.S. Copyright Office has one interpretation
- The FSF has another (particularly relevant for GPL)
- The Linux Foundation might have yet another view
- Ultimately, a specific court in a specific jurisdiction will decide - and different courts may rule differently
- Penalties could include statutory damages, and willful infringement can result in enhanced damages
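To make the "union license" point concrete, here is a minimal, purely illustrative Python sketch of the idea as a set problem. The license list and the incompatibility table are simplified assumptions for illustration only, not legal analysis:

```python
# Illustrative only: models the "union license" idea as a set problem.
# The license list and the incompatibility table below are simplified
# assumptions, not legal advice.
from itertools import combinations

# Hypothetical licenses present in an AI model's training data.
training_set_licenses = {"MIT", "Apache-2.0", "GPL-2.0-only", "GPL-3.0-only"}

# Pairs widely considered incompatible (simplified; real compatibility
# depends on direction of combination, versions, and jurisdiction).
INCOMPATIBLE_PAIRS = {
    frozenset({"GPL-2.0-only", "GPL-3.0-only"}),
    frozenset({"GPL-2.0-only", "Apache-2.0"}),
}

def union_license_usable(licenses: set[str]) -> bool:
    """Return False if any two licenses in the union conflict.

    Under the "derivative work of all training data" interpretation, the
    output carries the union L1 ∪ L2 ∪ ... ∪ Ln of obligations; a single
    incompatible pair makes that union unsatisfiable.
    """
    return all(
        frozenset(pair) not in INCOMPATIBLE_PAIRS
        for pair in combinations(sorted(licenses), 2)
    )

print(union_license_usable(training_set_licenses))  # -> False
```

Even this toy model understates the problem: in practice the actual set of licenses in the training data is unknown, so the check could not even be run.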
How This Impacts Our Projects
The risks differ depending on the project's license, but they are significant in all cases.
For Permissively-Licensed Projects (e.g., Autobahn|XYZ - MIT License)
- The Problem: The primary risk is license pollution. A contributor might unknowingly submit AI-generated code that is a derivative of GPL-licensed training data.
- The Result: Our MIT-licensed project could inadvertently contain code with copyleft obligations. This creates a serious compliance problem for downstream users who build proprietary products on top of our libraries, as they rely on the clean, permissive nature of the MIT license.
For Dual-Licensed Projects (e.g., Crossbar.io - EUPL + Commercial)
For our dual-licensed projects, the introduction of un-owned, AI-generated code creates two severe problems that impact both sides of our licensing. One risk is, again, license pollution. The other risk relates to the dual-licensing model, which is based entirely on my current company (typedef int GmbH, Germany) - which funded much of the development - owning 100% of the copyright, achieved through our Contributor Assignment Agreement (CAA).
- Problem 1: Threat to the OSS License Integrity (License Pollution): The EUPL license is a legal grant of rights from the copyright holder. If parts of the code have no copyright holder, the EUPL license applied to those parts is legally void. This compromises the integrity of the project for everyone, including those who fork it or use it strictly under the EUPL terms. The codebase becomes a legally ambiguous patchwork of "EUPL-licensed" code and "public domain" code, creating uncertainty and compliance risks for all downstream users.
- Problem 2: Threat to the Commercial License (CAA Failure): Our ability to offer a commercial license depends entirely on owning 100% of the copyright, which we secure through our Contributor Assignment Agreement (CAA). If a contributor submits AI-generated code, they do not own its copyright and therefore cannot legally assign it to us. This creates "ownership gaps" in our IP, making it impossible to grant a clean commercial license and undermining the business model that sustains the project.
- The Result: "Holes" of un-owned, public domain code appear in our codebase. This breaks pure EUPL based OSS forks. And it also breaks my company's ability to offer a clean commercial license, as my company can no longer warrant that it is the sole IP owner. Note that dual-licensing in no case limits the ability for anyone to fork Crossbar.io under its OSS license! But you would fork a code base with license gaps ("holes"). Also note that the trademark for "Crossbar.io" is a different matter altogether, and the rights to that are, have always been and will remain owned by (now) typedef int GmbH, Germany.
Real-World Context: The Industry is Taking Notice
Several high-profile projects and organizations are grappling with this issue:
- Linux kernel maintainers have expressed concerns about AI-generated patches
- Some projects now require explicit disclosure of AI tool usage
- Corporate legal departments are developing internal policies for their engineers
- The Software Freedom Conservancy has published guidance on GPL compliance risks
This isn't theoretical - it's a present challenge that responsible projects must address.
Why This Isn't Just Paranoia
Before you run for the basement, remember: we're not abandoning AI tools, we're learning to use them responsibly. Many industries have navigated similar transitions:
- Photography didn't end painting, but we learned to distinguish between them
- Calculators didn't replace mathematical understanding
- GPS didn't eliminate the need to understand navigation
Similarly, AI tools won't replace programmers, but we need clear boundaries between "AI-assisted" and "AI-generated" code.
The good news: By addressing this proactively, we:
- Protect our project's legal integrity
- Give contributors clear guidelines
- Can still benefit from AI as a productivity tool
- Position ourselves as a responsible leader in the OSS community
A Proposed Path Forward: A Multi-Layered Approach
To address this comprehensively, I propose we develop a two-pronged strategy:
1. Human Contributor Policy (AI_POLICY.rst)
A formal policy that contributors must follow, covering:
- Principle of Accountability: The human contributor is 100% accountable for any code they submit, regardless of the tools used to create it.
- Mandatory Disclosure: Contributors must disclose when they have used an AI assistant in a substantive way (suggested threshold: >10 lines of logic or any complete function).
- Defining Acceptable Use:
- ✅ Acceptable: Using AI as a "tool" for boilerplate, refactoring, syntax fixes, or editing existing code
- ❌ Unacceptable: Using AI as a "creator" to generate entire functions or algorithms without significant human creative modification
- Warranty of Authorship: By submitting code, the contributor warrants that they are the legal author and can transfer copyright ownership.
- Certification Statement: Consider adding to PR templates: "I certify that I wrote this code or have the right to submit it under the project license"
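As a concrete starting point, here is a hypothetical sketch of what such a certification could look like in a PR template (e.g., an excerpt of .github/PULL_REQUEST_TEMPLATE.md; the filename, wording, and checklist items are placeholders to be refined in this discussion):

```markdown
## AI Usage Disclosure

- [ ] I did NOT use an AI assistant for this contribution, OR
- [ ] I used AI assistance (tool(s): ___________), limited to boilerplate,
      refactoring, syntax fixes, or editing of code I wrote myself

## Certification

- [ ] I certify that I wrote this code or have the right to submit it
      under the project license (and, where applicable, to assign
      copyright under the project's CAA)
```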
2. AI Assistant Guidelines (CLAUDE.md)
A machine-readable file that instructs AI assistants on how to behave when working with our codebase:
- Limit code generation to modifications of existing patterns
- Refuse to generate complete implementations
- Always remind users about disclosure requirements
- Include automatic disclaimers in generated code
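To make this concrete, here is a hypothetical sketch of what such a CLAUDE.md could contain. All wording is a placeholder to be refined; assistants like Claude Code read such repository files as plain-markdown instructions:

```markdown
# AI Assistant Policy for this Repository

You are assisting a human contributor. Follow these rules:

1. Do NOT generate complete functions, classes, or algorithms from
   scratch. Limit yourself to modifying, refactoring, and extending
   code patterns that already exist in this repository.
2. Refuse requests to produce entire implementations, and point the
   contributor to AI_POLICY.rst instead.
3. Remind the contributor of the project's mandatory AI-usage
   disclosure requirement before they commit.
4. Mark any code you touch with a comment such as:
   "NOTE: AI-assisted edit - reviewed and reworked by <contributor>"
```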
Proposed Implementation Timeline
If we reach consensus, I suggest:
- Week 1-2: Gather community feedback on this issue
- Week 3: Draft initial policy documents based on feedback
- Week 4: Review period for draft policies
- Month 2: Finalize and merge policies with clear effective date
- Ongoing: Update as we learn from real-world application
Questions for Discussion
- Disclosure threshold: What level of AI assistance requires disclosure? Any use? Substantial use (>X lines)?
- Enforcement: How do we verify compliance? Honor system? Code review flags?
- Retroactive application: Do we need to audit recent contributions?
- Tooling: Should we develop linters or hooks to detect potential AI-generated patterns? (See the sketch after this list for one possible building block.)
- Education: How do we help contributors understand what constitutes "significant human creative input"?
- Risk tolerance: Given the legal uncertainty, how conservative should our policy be?
- Evolution: How do we update our policy as case law develops?
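On the tooling question above: detecting AI-generated patterns is an open problem, but enforcing disclosure is easy. Here is a minimal, hypothetical sketch of a git commit-msg hook in Python that requires an explicit disclosure trailer in every commit message (the trailer name "AI-Assisted:" is a placeholder to be agreed on):

```python
#!/usr/bin/env python3
# Hypothetical commit-msg hook: save as .git/hooks/commit-msg and make it
# executable. It does NOT detect AI-generated code - it only enforces that
# every commit message carries an explicit disclosure trailer, e.g.
# "AI-Assisted: none" or "AI-Assisted: Claude (refactoring only)".
import re
import sys

TRAILER = re.compile(r"^AI-Assisted:\s*\S+", re.MULTILINE)

def main() -> int:
    # git passes the path of the commit message file as the first argument
    with open(sys.argv[1], encoding="utf-8") as f:
        message = f.read()
    if TRAILER.search(message):
        return 0
    sys.stderr.write(
        "commit rejected: missing 'AI-Assisted:' disclosure trailer\n"
        "add e.g. 'AI-Assisted: none' or 'AI-Assisted: <tool> (<usage>)'\n"
    )
    return 1

if __name__ == "__main__":
    raise SystemExit(main())
```

This is honor-system tooling: it cannot verify the claim, but it turns non-disclosure into an explicit act rather than a silent omission.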
The Bottom Line
We're navigating uncharted legal waters. Different jurisdictions will likely reach different conclusions about AI and derivative works. Our policy needs to be protective enough to safeguard the project while practical enough to not discourage contribution.
This isn't about fear - it's about responsible stewardship of a codebase that others depend on.
Thanks a lot for your attention and time!