
Licensing a Language: How the copyright system abuses fundamental human rights

August 9, 2021

Who owns the language that you speak? The people who created it? The people who use it? The people who document it? No one at all?

It’s a question that came up recently in our Lingua Franca project – our multilingual parsing and formatting library. A contributor had used a website to check their translation and wanted to do the right thing by acknowledging that source. I’m glad they did, because we want to acknowledge the work we build upon, and because it was a clear flag that we couldn’t accept their contribution.

Unfortunately, that use of a third-party source means that at least some of their contribution was potentially not theirs to submit. It may belong to someone else, even though it is written in their own native language.

You might be asking yourself… WTF? 

The answer, like so much of our legal system, is a big pile of grey goop, and each country has its own version. In general, you can’t copyright a language or general-use phrases. In effect, however, that’s exactly what happens.

Let’s take a quick look at the Macquarie Dictionary license. Of particular note:

Except as expressly permitted in Clause 2.1, the Licensee warrants that it will not, nor will it license or permit others to, directly or indirectly, without Macquarie’s prior written consent:

(e) use the Licensed Material to create any derivative work, product or service, or merge the Licensed Material with any other product, database, or service;


Seems fair enough. Clearly I can’t copy their database and create a competing dictionary website. But what constitutes a derivative work in the context of language could be pretty broad, and you can quickly take it to comical extremes. If I use the Macquarie Dictionary to learn English and then create anything using my newly acquired language skills, is that a derivative work? Seems stupid until someone gets sued for it… (actually it would still be stupid). 

Even more interesting is a right they do grant and the restriction placed on that:

…Macquarie grants to the Licensee the following non-exclusive rights (the “Rights”) for the Term:

(d) create a hypertext link to any part of the Licensed Material provided that no person other than the Licensee may use such hypertext link.


So technically, sharing a link to any part of their site that contains “Licensed Material” for anyone else to follow would violate the terms of this license agreement.

I’m sure their argument would be that they aren’t licensing the language itself, they are licensing their collation and interpretation of it. This is a valid argument. As a company they have put a lot of time, money and effort into producing this content. I think most people would agree that setting up a competing dictionary website that just rips off their content would be unfair and unjust.

What though, does it mean for the broader use of this knowledge? 

I haven’t looked at their definition for a triangle, but I’m fairly sure it will be something like “a two-dimensional shape with three sides”, or perhaps “a three-sided two-dimensional shape”. There just aren’t too many different ways you can say what a triangle is. Does this mean they have copyrighted their particular order of half a dozen words? Does this mean I just violated their copyright?

Given I haven’t checked, perhaps this is a case of “Schrödinger’s Copyright”. I may be both violating and not violating their copyright at the same time until the state of the system is observed. To play it safe, I’m just going to leave that box closed.

The legal, social and cultural ramifications of this are enormous, particularly for Indigenous and First Nations people around the world. What does it mean for a company to own even a subset of your language? How might that affect your fundamental human right to use your own language (particularly in regard to Articles 13, 14 and 16 of the United Nations Declaration on the Rights of Indigenous Peoples)? 

There is no way to do justice to this immensely important topic in this post. If you’re interested in learning more, I’d strongly recommend exploring the topic of Indigenous Data Sovereignty.

What does this mean for Mycroft?

At the end of the day, we can’t accept potentially copyrighted material into Mycroft projects no matter how fundamental it is. If you’re ever unsure about what you are submitting, please ask in Mycroft Chat, the Forums, or contact us directly.

It’s the same for any other project that crowd-sources content, whether it is open source or not. If you contribute to OpenStreetMap, you can’t just open a Google Maps window and start copying across street names. The street names themselves aren’t owned by Google, but the digital collection of them accessible through Google Maps is.

So if you are submitting translations or other content to a Mycroft project, please do not take it from third-party sources. It must be entirely your own work, which you license to Mycroft through our Contributor License Agreement.

In this case, you retain ownership of your contribution and, hey presto, for better or worse, you own a small slice of human language.