An Open Letter to the White House: We Must Bring Accountability to Public Web Data Collection

Large companies leverage individual bandwidth for web scraping, AI training, and public web data aggregation, underscoring the urgent need for greater accountability, user control, and transparency to create a more equitable and secure digital ecosystem.

NEW YORK, March 31st, 2025 - Think of your internet connection as a private highway where you expect to be the sole traveler. Unbeknownst to many users, however, numerous companies may be using that same highway for web scraping, covertly accessing and leveraging your internet bandwidth. How do these companies gain access to a private highway? You may have unknowingly granted them permission.

The internet is often thought of as a personal and private resource, yet millions of users unintentionally contribute their internet bandwidth to large-scale corporate data collection without their explicit consent or understanding. Businesses are forced into this unethical structure as well: by using residential proxy network services to access and scrape the public web data needed to train AI models, they unknowingly contribute to the exploitation of everyday internet users. These residential proxy networks are created by companies that embed middleware into commonly used applications, such as VPNs, screen savers, and free apps, allowing them to siphon off individuals' internet bandwidth. References to these practices are buried in difficult-to-read terms and conditions that large companies do not expect many users to read or fully comprehend.

The White House is calling for the Development of an Artificial Intelligence Action Plan, and this plan needs to address the hidden data economy that operates without transparency and exploits people’s resources. The use of unethical residential proxy networks for web scraping and AI training amplifies these risks. To ensure ethical and responsible AI governance, the plan must prioritize the following key issues:

  • Give internet users a choice: users should be fully aware of, and explicitly consent to, sharing their internet bandwidth.
  • Improve privacy and security: companies collecting and selling bandwidth for proxying operate with limited oversight; we call for better public frameworks to improve clarity.
  • Compensate bandwidth sharing: users whose internet bandwidth is being used to access data across the web should be fairly rewarded for their crucial role in the development of AI.

The Artificial Intelligence Action Plan must prioritize transparency, user control, and ethical data practices to ensure that internet users have the right to understand, opt in or out, and be fairly compensated for the use of their internet bandwidth in AI and corporate data collection. The public deserves a say in how their digital resources are being used. 

If you would like to discuss how ethical web scraping practices can ultimately affect the AI landscape and why the Artificial Intelligence Action Plan should address these points, Andrej Radonjic of Wynd Labs is available for comment.

Contact
Nicholas Young
grass@bulleitgroup.com