Meta, the parent company of Facebook and Instagram, has revealed that its new Meta AI virtual assistant was trained using public posts from these social media platforms.
The company said, however, that it excluded private posts shared only with family and friends to protect user privacy.
Meta’s President of Global Affairs, Nick Clegg, emphasized the company’s commitment to respecting user privacy during the training process.
He stated that private chats on Meta’s messaging services were not used as training data, and steps were taken to filter out private details from public datasets used for training.
Clegg said that Meta aimed to avoid datasets containing large amounts of personal information, citing LinkedIn as an example of a website whose content was deliberately not used due to privacy concerns.
Tech companies like Meta, OpenAI, and Google have faced criticism for using internet-scraped data without permission to train their AI models. These models require vast amounts of data to perform tasks such as summarizing information and generating images.
Meta AI, Meta’s new virtual assistant, was a significant product unveiled at the company’s annual Connect conference.
It was developed using a custom model based on the Llama 2 large language model, which Meta released for public commercial use in July, together with a new model called Emu that generates images in response to text prompts.
The Meta AI virtual assistant can generate text, audio, and imagery and has real-time information access through a partnership with Microsoft’s Bing search engine.
Public Facebook and Instagram posts, including text and photos, were used to train Meta AI, particularly for image generation. The chat functions were based on Llama 2, with some publicly available and annotated datasets added.
Clegg mentioned that safety restrictions were imposed on the content generated by Meta AI, such as a ban on creating photo-realistic images of public figures.
Regarding copyrighted materials, Clegg said he anticipated litigation over whether creative content is covered by the existing fair use doctrine.
As a measure to address copyright concerns, Meta has introduced new terms of service that prohibit users from generating content that violates privacy and intellectual property rights.