Meta’s Secret AI Experiments

Behind Meta’s achievements with advanced language models like Llama lie complicated realities that few people know about. While the company races against OpenAI, Google, and Anthropic to make artificial intelligence more powerful, recent investigations have uncovered questionable experiments in the development of Meta AI, experiments that raise serious ethical and legal concerns.

Hidden experiments that improved Meta AI

According to leaked internal documents and the findings of investigations, Meta conducted a series of secret experiments to improve its Llama models. One of the techniques used is called “ablation”: certain data is deliberately removed from, or swapped into, the training set, and the model is retrained so researchers can measure which ingredients actually affect its quality and effectiveness.
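The leaked documents don’t spell out Meta’s exact pipeline, but the general shape of a data-ablation study is easy to sketch. The Python below is a minimal, hypothetical illustration of the idea, not Meta’s code: `Document`, `train_model`, and `evaluate` are placeholder stand-ins for a real corpus record, a full pretraining run, and a benchmark harness.

```python
# Minimal data-ablation sketch (illustrative; not Meta's actual pipeline).
# Train one model on the full corpus and one on the corpus with a candidate
# data source removed, then compare benchmark scores to estimate that
# source's contribution.
from dataclasses import dataclass


@dataclass
class Document:
    text: str
    source: str  # e.g. "web", "wikipedia", "books"


def ablate(corpus: list[Document], excluded_source: str) -> list[Document]:
    """Return the corpus with every document from one source removed."""
    return [doc for doc in corpus if doc.source != excluded_source]


def train_model(corpus: list[Document]) -> dict:
    """Placeholder: a real run would pretrain a model and return its weights."""
    return {"num_docs": len(corpus)}


def evaluate(model: dict, benchmark: str) -> float:
    """Placeholder: a real harness would return accuracy on BoolQ, SIQA, etc."""
    return 0.0


corpus = [
    Document("A long novel ...", source="books"),
    Document("An encyclopedia entry ...", source="wikipedia"),
]

baseline = train_model(corpus)
ablated = train_model(ablate(corpus, excluded_source="books"))

for benchmark in ("BoolQ", "SIQA"):
    delta = evaluate(baseline, benchmark) - evaluate(ablated, benchmark)
    print(f"{benchmark}: score change from removing books = {delta:+.3f}")
```

A positive delta on a benchmark would suggest the removed source, here books, was pulling its weight, which is precisely the kind of signal the leaked documents describe.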

Meta’s experiments turned up a striking result: after pirated books from sites like LibGen were added to the training data, the model improved significantly on key benchmarks. These include reasoning tests such as BoolQ and SIQA, which measure how well an AI can answer intricate yes/no questions about a passage and reason about everyday social situations.
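To get a feel for what these benchmarks actually test, BoolQ is publicly available. A short sketch, assuming the `boolq` dataset id on the Hugging Face Hub and the `datasets` library, prints one of its passage/question/answer triples:

```python
# Peek at one BoolQ example to see the yes/no reading-comprehension
# format the benchmark uses. Requires: pip install datasets
from datasets import load_dataset

boolq = load_dataset("boolq", split="validation")
example = boolq[0]
print(example["passage"][:200], "...")  # the supporting passage
print(example["question"])              # a yes/no question about it
print(example["answer"])                # the ground-truth boolean
```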

In other words, stolen science fiction, academic textbooks, and literary works played a crucial role in making Meta AI “smarter.”

Why did Meta use pirated books?

Long, coherent texts that sustain a story or an argument are extremely valuable in the development of AI models. Such texts teach models to reason, tell stories, and solve problems far better than short social media posts or Wikipedia articles can.

But there is a problem: licensing copyrighted books is expensive and legally complex. So Meta researchers quietly pulled content from open piracy databases like LibGen, which hosts hundreds of thousands of scanned copyrighted books.

Internal correspondence shows that the Meta team considered these materials crucial to the model’s results while also acknowledging the legal and ethical problems. The team nevertheless decided not to inform the public or the authors, citing the legal doctrine of “fair use” and the argument that no single book has a significant impact on the final product.

Ethical controversies and legal consequences

These decisions drew severe public criticism. Writers, publishers, and legal experts stress that AI companies are profiting from work they neither created nor paid for. Several lawsuits are pending, including Kadrey v. Meta, in which authors claim their books were used without permission.

The issue here isn’t just legal rights; it’s transparency and accountability. AI companies like Meta, OpenAI, and Google rarely disclose in full how their models are trained, yet the quality of these models, and the risks they pose, depend largely on what they have learned.

A question arises: if systems costing billions of dollars are built on copyrighted works, shouldn’t the authors of those works have a say in how they are used, or at least receive compensation?

Progress and principles

Meta’s secret experiments show how far companies will go to achieve top results. They also show how ethical standards are being neglected in artificial intelligence: valuable content is taken without consent, and innovation routinely takes precedence over accountability.

Certainly, advanced artificial intelligence is an exciting prospect. But what kind of future are we creating if we’re building it on stolen knowledge?

Prepared by Navruzakhon Burieva
