Baichuan Releases the Baichuan-13B Large Model: "13 Billion Parameters, Open Source and Commercially Available"
Baichuan Intelligence, led by Wang Xiaochuan, recently launched its latest AI model, Baichuan-13B. The model has drawn industry attention with its pitch of "13 billion parameters, open source and commercially available".
Baichuan-13B is the next-generation large language model following Baichuan-7B, with 13 billion parameters. According to the official introduction, it achieves the best results among models of comparable size on both Chinese and English benchmarks. This release includes a pre-trained version (Baichuan-13B-Base) and an aligned version (Baichuan-13B-Chat).
Baichuan-13B's headline features are a larger model, more training data, simultaneously open-sourced pre-training and alignment models, more efficient inference, and free commercial use. Building on Baichuan-7B, Baichuan-13B expands the parameter count to 13 billion and is trained on 1.4 trillion tokens of high-quality corpus, 40% more than LLaMA-13B, making it the open-source model with the most training data at the 13B size to date.
The pre-trained model serves as a "base" for developers, while the aligned model has strong dialogue ability, works out of the box, and can be deployed with a few lines of code. To support more users, int8 and int4 quantized versions are also open-sourced in the project, greatly lowering the hardware requirements for deployment and enabling the model to run on consumer-grade graphics cards such as the NVIDIA RTX 3090.
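A back-of-the-envelope calculation (illustrative figures only, not official numbers from Baichuan) shows why quantization is what makes a consumer card viable: weights alone take roughly 2 bytes per parameter in fp16, 1 byte in int8, and half a byte in int4.

```python
# Rough weight-memory estimate for a 13B-parameter model.
# Counts weights only; activations and KV cache need extra memory.

PARAMS = 13_000_000_000  # 13 billion parameters

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params: int, dtype: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in ("fp16", "int8", "int4"):
    print(f"{dtype}: ~{weight_memory_gb(PARAMS, dtype):.1f} GB")
# fp16: ~26.0 GB, int8: ~13.0 GB, int4: ~6.5 GB
```

An RTX 3090 has 24 GB of VRAM, so the fp16 weights alone would not fit, while the int8 and int4 versions leave room to spare.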
Baichuan-13B is not only fully open for academic research; developers can also use it commercially for free after applying by email and obtaining an official commercial license. Baichuan-13B has the following characteristics:
- Larger size, more data: Baichuan-13B expands the parameter count to 13 billion on the basis of Baichuan-7B and is trained on 1.4 trillion tokens of high-quality corpus, 40% more than LLaMA-13B, making it the open-source model with the most training data at the 13B size to date. It supports both Chinese and English, uses ALiBi position encoding, and has a context window of 4,096 tokens.
- Simultaneously open-sourced pre-training and alignment models: the pre-trained model is a "base" for developers, while most ordinary users have a stronger need for an alignment model with dialogue ability. This release therefore also includes the alignment model (Baichuan-13B-Chat), which has strong dialogue ability, works out of the box, and can be deployed with a few lines of code.
- More efficient inference: to support more users, int8 and int4 quantized versions are open-sourced at the same time. Compared with the non-quantized version, they greatly lower the hardware requirements for deployment with almost no loss in quality, and can run on a consumer graphics card such as the NVIDIA RTX 3090.
- Open source, free for commercial use: Baichuan-13B is not only fully open for academic research; developers can also use it commercially for free after applying by email and obtaining an official commercial license.
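The "almost no loss in quality" claim rests on how weight quantization works. A minimal sketch of symmetric int8 quantization (the general technique, not Baichuan's actual quantization code) shows that the roundtrip error per weight is bounded by half a quantization step:

```python
# Minimal sketch of symmetric int8 weight quantization. Illustrative only;
# this is NOT the actual quantization scheme shipped with Baichuan-13B.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]

weights = [0.42, -1.30, 0.07, 0.95, -0.51]
q, scale = quantize_int8(weights)
recovered = dequantize_int8(q, scale)

# Rounding loses at most half a step, so the error per weight is <= scale / 2.
max_err = max(abs(w - r) for w, r in zip(weights, recovered))
print(f"max roundtrip error: {max_err:.5f} (step size: {scale:.5f})")
```

Because the per-weight error stays below half the step size, large models typically retain almost all of their accuracy at int8, while storage drops by half relative to fp16.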