At WWDC, Apple unveiled a transformative new feature powered by a Transformer language model that is set to enhance predictive text recommendations in upcoming iOS and macOS versions.


Breakdown

Jack Cook highlights this as Apple's foray into the more uncertain world of large language models (LLMs), a departure from their usual focus on polish and perfection. The feature raised questions about the underlying model, its architecture, and the training data used, with few details available, prompting further inquiry.

Key points:

  • Apple's adoption of a Transformer language model for predictive text recommendations in iOS and macOS reflects a strategic shift toward incorporating advanced language technologies.

  • The feature suggests completions for individual words as users type, with occasional multi-word suggestions, demonstrating an evolving integration of predictive text functionality (see the sketch after this list).

  • Uncertainty lingers around the specifics of the model's architecture, its training data sources, and the extent of Apple's integration of Transformer-based technology.
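To make the suggestion mechanism concrete, here is a minimal sketch of how a Transformer language model can surface single-word completions as a user types. It uses the openly available GPT-2 model from the Hugging Face transformers library purely as a stand-in; Apple's actual model, tokenizer, and on-device APIs are not public, so every name and parameter below is illustrative only.

```python
# Minimal sketch: an off-the-shelf GPT-2 model standing in for Apple's
# (undisclosed) on-device Transformer. Illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def suggest(prefix: str, max_new_tokens: int = 3) -> str:
    """Greedily extend the typed prefix and return the first suggested word."""
    inputs = tokenizer(prefix, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,                      # deterministic, highest-probability continuation
            pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
        )
    continuation = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:])
    # Keep only the first whole word of the continuation, mimicking the
    # single-word suggestions described above.
    return continuation.strip().split(" ")[0]

print(suggest("I'll see you at the"))  # output depends on the model's training data
```

A production keyboard would of course run a much smaller, quantized model on device and rank several candidate words rather than returning one greedy continuation, but the core loop, feeding the typed prefix to the model and decoding a short continuation, is the same idea.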

Highlights

Apple hasn't deployed many language models of their own, despite most of their competitors going all-in on large language models over the last couple years. I see this as a result of Apple generally priding themselves on polish and perfection, while language models are fairly unpolished and imperfect.