Falcon

Pioneering the Next Generation of Language Models

Falcon-Edge: A series of powerful, universal, fine-tunable 1.58-bit language models.

In this blog post, we present the key highlights of, and the rationale behind, the Falcon-Edge series - a collection of powerful, universal, and fine-tunable language models available in ternary format, based on the BitNet architecture. Drawing from our experience with BitNet, Falcon-Edge introduces and validates a new pre-training paradigm that delivers full-scope output from a single training run, simultaneously yielding both non-quantized and quantized model variants. This approach produces a non-BitNet model in bfloat16 format, the native BitNet model, and a pre-quantized BitNet variant specifically engineered for effortless fine-tuning, enabling users and developers to tailor these models precisely to their applications and needs. ...
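To make the single-run, multi-variant idea concrete, here is a minimal sketch of loading the three variants with transformers. The model ID and the revision names below are assumptions for illustration only, not confirmed identifiers from the release; check the official Falcon-Edge model cards on the Hugging Face Hub.

```python
# Minimal sketch: loading the three Falcon-Edge variants with transformers.
# The repo ID and revision names are ASSUMPTIONS for illustration; consult
# the official model cards for the actual identifiers.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "tiiuae/Falcon-E-1B-Base"  # assumed model ID

tokenizer = AutoTokenizer.from_pretrained(repo)

# Native BitNet (ternary) model, assumed to be the default revision.
bitnet_model = AutoModelForCausalLM.from_pretrained(repo)

# Non-quantized bfloat16 counterpart, assumed to live on a separate revision.
bf16_model = AutoModelForCausalLM.from_pretrained(repo, revision="bfloat16")

# Pre-quantized variant intended for fine-tuning, again an assumed revision.
ft_model = AutoModelForCausalLM.from_pretrained(repo, revision="prequantized")
```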

May 15, 2025 • 12 min • 2477 words • Falcon Team

Welcome to the Falcon 3 Family of Open Models!

We introduce Falcon3, a family of decoder-only large language models under 10 billion parameters, developed by the Technology Innovation Institute (TII) in Abu Dhabi. By pushing the boundaries of performance and training efficiency, this release reflects our ongoing commitment to advancing open and accessible large foundation models. Falcon3 represents a natural evolution from previous releases, emphasizing the expansion of the models’ science, math, and code capabilities. ...

December 17, 2024 • 6 min • 1260 words • Falcon Team

Welcome Falcon Mamba: The first strong attention-free 7B model

Falcon Mamba is a new model by the Technology Innovation Institute (TII) in Abu Dhabi, released under the TII Falcon Mamba 7B License 1.0. The model is open access and available within the Hugging Face ecosystem for anyone to use for research or application purposes. In this blog, we will go through the design decisions behind the model, how it is competitive with existing SoTA models, and how to use it within the Hugging Face ecosystem. ...
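As a quick taste of the Hugging Face usage covered in the post, here is a minimal sketch, assuming the model ID tiiuae/falcon-mamba-7b and a transformers release recent enough to include FalconMamba support:

```python
# Minimal sketch: running Falcon Mamba via transformers.
# Assumes the model ID "tiiuae/falcon-mamba-7b" and a transformers
# version with FalconMamba support (plus accelerate for device_map).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The Falcon Mamba architecture is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```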

August 12, 2024 • 7 min • 1463 words • Falcon Team

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

TII is launching a new generation of models, Falcon 2, focused on providing the open-source community with a series of smaller models with enhanced performance and multi-modal support. Our goal is to enable cheaper inference and encourage the development of more downstream applications with improved usability. The first generation of Falcon models, featuring Falcon-40B and Falcon-180B, made a significant contribution to the open-source community by promoting the release of advanced LLMs with permissive licenses. More detailed information on the previous generation of Falcon models can be found in the RefinedWeb (Penedo et al., 2023) and The Falcon Series of Open Language Models (Almazrouei et al., 2023) papers, and in the Falcon and Falcon-180B blog posts. ...

May 24, 2024 • 7 min • 1384 words • Falcon Team

Spread Your Wings: Falcon 180B is here

Today, we’re excited to welcome TII’s Falcon 180B to Hugging Face! Falcon 180B sets a new state of the art for open models. It is the largest openly available language model, with 180 billion parameters, and was trained on a massive 3.5 trillion tokens using TII’s RefinedWeb dataset. This represents the longest single-epoch pretraining for an open model. You can find the model on the Hugging Face Hub (base and chat models) and interact with it on the Falcon Chat Demo Space. ...

September 6, 2023 • 7 min • 1392 words • Falcon Team