The role of Internet stakeholders in Internet-enabled AI apps and systems

DigitalMedusa has spent the past year working on the implications of generative AI, or more generally Internet-enabled AI, for Internet and web governance. Many issues need to be disentangled and addressed, but we cannot be too speculative. This is why, while we do monitor the issues that AI can raise for cybersecurity and Trust and Safety online, we have homed in on one very pressing issue: Internet-enabled AI development. Discussions around AI developments enabled by the Internet echo many of the issues that arose when the Internet itself was new: copyright concerns, digital rights debates, and infrastructure challenges. It feels like a case of déjà vu, reminding us that the best way to address such moments is to draw lessons from the past and approach governance with greater creativity.
Which AI are we talking about? Generative and Internet-enabled AI
In AI governance conversations it is essential to be clear about what we mean by “AI governance”. AI is, at the end of the day, a tool that can be used in many fields and industries. In this report we discuss AI that is Internet-enabled. For “Internet-enabled AI” systems and apps to function, and for people to have access to them, they have to use the Internet and the data on the Internet.
The functions of AI apps: AI apps can have multiple functions. They can streamline access to knowledge, provide AI-enabled Internet search, or provide other online services.
Most Internet-enabled AI apps and some AI systems use the data available on the open web to train the apps or to improve their ability to perform these functions. This has caused many issues across the Internet industry for open source AI developers, publishers, content creators and the users of AI products. Sometimes AI developers crawl the web to improve or train their AI app. This has raised concerns among publishers and creators, and as a result regulators and lawmakers (such as the EU and the UK) and international agencies such as WIPO (see WIPO’s report) are becoming increasingly involved in how AI developers should operate on the Internet and the open web.
Stakeholder mapping: who are the actors involved with policy about the use of the Internet and web for developing and operating AI apps?
We can identify the actors, and more generally the stakeholders, that need to be actively involved in shaping policy on these issues by asking whose rights could be affected and who has some control over how those rights are protected and regulated. This more or less follows a Rawlsian stakeholder mapping model, which is usually used for corporate governance and corporate social responsibility but has also spilled over into other fields. (Stakeholder mapping appears in Internet governance too: for example, the well-known international relations professor Joe Nye (see page 7) mapped cyberspace in a way that included various stakeholders.) Rawlsian stakeholder mapping is used here because it is generally more inclusive than other stakeholder mapping methods that apply only to corporate governance (Freeman) or to managerial processes (Phillips, who applied Rawls's theory of justice to stakeholder mapping). Using Rawlsian theory might not be perfect, but it has advantages over stakeholder mapping methods that only consider those who have a visible interest or who have the means and resources to bring lawsuits against AI developers.
The stakeholder landscape in this field is evolving, as AI apps and Internet-enabled AI systems can bring about novel services that introduce new stakeholders. At the moment, one of the most important and pressing issues is the use of Internet data, URIs and other Internet technologies in AI apps and systems. To identify the stakeholders, I pose the following questions:
1) What is the nature of the AI providers?
To understand which stakeholders are at play, and to include as many of them as possible, it is important to understand the diversity that exists among AI developers and AI providers. Generally, AI providers that operate online can be divided into open source AI providers, which may be noncommercial in nature, and commercial AI providers. The landscape is evolving: just as Internet technical operators and their nature have evolved, AI operators and providers will evolve too. The AI developers and operators on the Internet can be large tech platforms such as Google, or they can be independent entities such as Hugging Face and OpenAI. Some tech companies might also provide AI products to improve their own platforms (for example, LinkedIn uses an AI product to help with some of its service’s features). AI developers can also be non-profit companies and educational institutions that create Internet-enabled AI services and applications for their communities.
2) How does the AI developer’s operation affect the web and the Internet?
For now, the answer to this question is largely anecdotal and hypothetical. The AI developers’ operations on the Internet could have an impact on some of the critical properties of the Internet and URIs. They could also have an impact on the open web: for example, excessive crawling and scraping by AI operators may lead publishers and platforms to close their platforms, which could in turn reduce access to knowledge (see Consent in Crisis by the Data Provenance Initiative). Excessive crawling and scraping could also increase website traffic to the point of crashing web services, and raise the cost of maintaining online services (see Business Insider).
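One existing mechanism that bears on the crawling concerns above is the robots.txt convention, through which a website can signal which crawlers it does not want and how often others should visit. Compliance is voluntary on the crawler's side. The following is a minimal sketch, not a description of how any particular AI developer operates: the crawler name ExampleAIBot, the example.org URLs and the robots.txt contents are all hypothetical, and the code only uses Python's standard urllib.robotparser module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the site blocks one AI crawler entirely and
# asks every other crawler to stay out of /private/ and to slow down.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before fetching each URL.
ai_bot_allowed = parser.can_fetch("ExampleAIBot", "https://example.org/articles/1")
other_allowed = parser.can_fetch("OtherBot", "https://example.org/articles/1")
other_private = parser.can_fetch("OtherBot", "https://example.org/private/x")

# Requested pause (in seconds) between requests, if the site states one.
other_delay = parser.crawl_delay("OtherBot")

print(ai_bot_allowed, other_allowed, other_private, other_delay)
```

A crawler that honours these signals would skip every page for ExampleAIBot and would wait between requests for other agents, which is one way the traffic and cost pressures described above can be reduced without a site closing itself off entirely.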
3) Whose services and interests are at stake?
Stakeholders can usually be identified as those whose interests or services could be affected. These are the stakeholders that can be directly impacted, such as content providers and publishers (because their content gets crawled), and web crawler services such as Common Crawl, Internet archivists such as the Internet Archive, and AI libraries (because they crawl content). The providers of online services could be the copyright owners themselves, or could be providing a service that uses potentially copyrighted materials (for example, platforms that host user generated content might not own all the content that their users post, but they still provide a service).
4) What rights and whose rights are at stake?
As the field develops, the effects of Internet-enabled AI, and of laws, regulations, standards and policies, on rights will become more apparent. At present, the rightsholders that most regulations and laws try to protect are copyright owners. It is important to acknowledge that copyright and intellectual property rights are not the only rights that could be affected, and to track the potential impact of policies on other rights.
People around the world, especially in recent years, increasingly access essential services and knowledge through the Internet. AI-enabled apps that operate on the Internet could make such access to knowledge and services even easier. The right of access to information, knowledge and essential services could therefore be affected if policies are too restrictive. There are also other rights holders, such as creators and publishers, that have certain rights over the usage and distribution of their materials. This helpful lawsuit tracker that WIRED put together recently can clarify, to a certain extent, which copyright holders and what kinds of AI developers and providers the mapping exercise should include.
Who are the stakeholders in this space?
Based on our analysis, the stakeholders involved in these conversations and policies for now are:
- AI developers (commercial and noncommercial)
- Internet platforms that use AI or whose users use AI (for example Wikipedia)
- AI users and civil society organizations that advocate for rights such as freedom of expression, access to information and knowledge, and others
- Web crawler services
- Internet archivists and AI libraries (data repositories and open-source datasets)
- Publishers and copyright owners
- Regulators, policymakers, international organizations that have a human rights and technology mandate, international organizations that have an intellectual property mandate, and Internet and web standards bodies (WIPO, the Internet Engineering Task Force – IETF, W3C and others)
Internet stakeholders must be involved with Internet-enabled AI governance
As we can see from this stakeholder mapping exercise, many of these AI stakeholders are Internet stakeholders. The emerging issues are as old as the Internet, so we do not have to reinvent the wheel. New challenges will obviously arise, but the lessons of the past should not be ignored. The stakeholders mentioned here are also evolving, and new actors could emerge.
Next, DigitalMedusa will focus on deepening its analysis of Internet-enabled AI by engaging with key stakeholders, including AI developers, publishers, civil society organizations, and policymakers, to address pressing concerns such as data usage, web crawling, content rights, and rights of access to online services and knowledge. By drawing on lessons from past Internet governance challenges, we will propose actionable strategies to address emerging issues and ensure that policies remain adaptive, inclusive, and future-focused, preventing the Internet from becoming a fragmented and fractured set of walled gardens of materials and services.




