Meta's focuses: from MR to AI

Kari Wu, special to DIGITIMES Asia, Taipei

Credit: Andreas Psaltis, 2024 ACM SIGGRAPH

It wasn't all that long ago that Meta reorganized internally to "prioritize AI efforts". Some Meta staff were left with the impression that the metaverse efforts, which had prompted the company's name change from Facebook, were no longer the priority.

Mark Zuckerberg, Meta's CEO, admitted at SIGGRAPH 2024 that if he had been asked five years ago whether holographic AR would happen before AI, he would have said yes. "And then these breakthroughs happened with LLMs. And it turned out that we have sort of really high-quality AI now and getting better at a really fast rate before you have holographic AR. So it's the inversion that I didn't really expect."

Sensors

For the record, Meta started its AI research arm, Fundamental AI Research (FAIR), before it started Reality Labs. Whether Zuckerberg successfully predicted the trajectory of AI and MR may be a less interesting question than how these two strategic bets merged.

One of the most frequently cited successes of Meta's next-generation computing initiatives is its partnership with Ray-Ban. The high-tech sunglasses put cameras, microphones, and speakers on consumers' faces without providing a display of any kind. "The sensor package happens to be the sensors you need to talk to your AI," said Zuckerberg.

If the Ray-Ban Meta smart glasses are the future equivalent of smartphones, Apple's Vision Pro and other MR headsets are what the workstation will become. For Meta, shipping a product that has product-market fit is more important than showcasing a cutting-edge display solution.

Meta thus picked the lightweight product family with a $300 MSRP. Zuckerberg believes the "good looking" glasses can resonate with tens of millions of people; the full holographic experience can come later. The underlying implication is that Meta is playing a different game from Apple.

While Meta is focusing on consumers, Apple taps into enterprise use cases. Apple's Vision Pro is a much bulkier headset (600-650g plus the 353g battery) with a much higher price tag ($3,499). In the future, we can expect to see more diverse designs of Ray-Ban Meta smart glasses because consumers prefer unique styles for their accessories.

Dynamic content and customized agents

The industry's entire tech stack is being redesigned, and both consumers' and creators' understanding of content is evolving accordingly. Zuckerberg painted a picture of content being created on the fly for users, with different mediums being pulled together and synthesized. Furthermore, "content" doesn't need to be passively consumed: it can also be interactive, like the chatbots that creators can build with AI Studio, which extend their real selves into more scalable interactions with their followers.

Customization and control are the two keywords for models. Zuckerberg said, "Helping people distill their own models from the big model is going to be a really valuable new thing." In contrast to a single agent serving all users, Meta's approach is to empower more consumers to fine-tune models for their specific needs. This direction may be informed, at least in part, by the top Meta AI use case: role-playing.

Whether it's practicing asking one's manager for a pay raise or preparing for a serious conversation with one's girlfriend, AI is a judgment-free vehicle. Many Meta users find it helpful for rehearsing their real-world conversations. We may soon see a "vast proliferation of models": different agents specialized in different utilities will give users an incentive to stick with Meta's family of products.

The level of customization in AI models has implications for UX/UI. If models understand users' intent, human-machine interaction won't look so much like ChatGPT's turn-based experience. Users can share goals with models, which can then help break those goals down into tasks across different timeframes.

Open Source AI

When GPT-3.5 first gained popularity, the open-source community was discussing how Meta may not feel threatened by it because of its open-source strategy. The strategy is, in a nutshell, to provide models and tools and to establish AI infrastructure for the open-source community. This allows Meta to define the industry standards and capture the development results from a much broader developer group that is hungry for data and eager to keep autonomy and control.

"For the next generation, the open one is going to win," Zuckerberg said. Facebook started on the web, while mobile is the lucrative game over which it has less control. Meta has to deliver mobile products through Apple and Google. According to Zuckerberg, "it has been challenging" because "Apple set the terms and Android follows suit".

Meta certainly would not want a closed ecosystem for AI. Here are some key open-source solutions Meta has been driving:

1. Llama: foundation model; the latest instruction-tuned model is available in 8B, 70B, and 405B versions.

2. PyTorch: machine learning library

3. Segment Anything: a segmentation system with zero-shot generalization

In Zuckerberg's open letter to the community, we can see the tangible progress of Meta's open-source strategy. "Last year, Llama 2 was only comparable to an older generation of models behind the frontier. This year, Llama 3 is competitive with the most advanced models and leading in some areas. Starting next year, we expect future Llama models to become the most advanced in the industry. But even before that, Llama is already leading on openness, modifiability, and cost efficiency."

At SIGGRAPH, Zuckerberg half-jokingly said that he was invited by NVIDIA's CEO Jensen Huang to be part of the keynotes because Meta is a large client of NVIDIA's. He is not wrong. Today's Meta is far more than a social media giant. To find a way out from under Apple's ecosystem play, Meta has to be aggressive in the AI field and pave its own way toward shaping the next dominant hardware platform.

Acknowledging the view that foundation model progress across the whole industry may be hitting a bottleneck, Zuckerberg remained firmly optimistic about the field's development, saying, "We'd have like five years of product innovation for the industry to basically figure out how to most effectively use all the stuff that's gotten built so far." Using crowdsourcing to keep up steam and move faster than competitors' closed systems is Zuckerberg's personal philosophy and Meta's operational plan for the foreseeable future.

Author's bio

Kari Wu is a Senior Technical Product Manager at Unity Technologies, the leading platform for creating real-time interactive 3D content. Previously, she was an entrepreneur focused on augmented and virtual reality. Kari founded FilmIt, a startup enabling users without formal training to film professional-looking videos using augmented reality and automated editing solutions.

Born in Taiwan and raised across cultures, Kari brings a global vision to her work. Her experience consulting businesses in South Korea and Taiwan, and living in Boston, Los Angeles, and San Francisco, gives her a unique ability to see from multiple perspectives. From immigrant to entrepreneur to tech leader, Kari's lifelong curiosity has driven her journey of evolving identities and careers. Kari earned her MBA and MS in Media Ventures from Boston University.