Kai-Fu Lee personally unveils a large model; company valued at $1 billion within 8 months of founding


Kai-Fu Lee, Chairman and CEO of Sinovation Ventures.

On November 6th, the artificial intelligence company 01.AI ("Zero One Everything"), founded by Kai-Fu Lee, Chairman and CEO of Sinovation Ventures, released its first open-source bilingual (Chinese-English) model series, "Yi". The company has also completed a new financing round, led by Alibaba Cloud. 01.AI is now valued at over $1 billion, reaching unicorn status less than eight months after its founding.


As of November 5th, Yi-34B had outperformed the leading open-source models on key benchmarks, including Meta's highly regarded large language model LLaMA 2. It climbed to the top of both the Hugging Face open-source community's leaderboard of best-performing large language models (an English-language evaluation) and the C-Eval Chinese benchmark, making it a "dual champion" among the world's open-source large models. It is so far the only Chinese model to have topped the Hugging Face global open-source leaderboard.

34B is a rare "golden-ratio" size for open-source large models

The "Yi" series of bilingual open source models in Chinese and English includes two versions: Yi-6B (a basic model with a parameter scale of 6 billion) and Yi-34B (a basic model with a parameter scale of 34 billion).

The "Yi" series of bilingual open source models in Chinese and English includes two versions: Yi-6B (a basic model with a parameter scale of 6 billion) and Yi-34B (a basic model with a parameter scale of 34 billion).

Kai-Fu Lee believes that 34 billion parameters is a scarce "golden-ratio" size for open-source large models: it is large enough to cross the threshold of "emergent" capabilities and meet accuracy requirements, while still allowing vendors to run efficient single-GPU inference and train at reasonable cost. In terms of parameters versus performance, Yi-34B uses less than half the parameters of LLaMA2-70B yet surpasses the global leaders on a range of benchmark tasks.
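The single-GPU-inference claim can be illustrated with a back-of-envelope memory estimate. This is a rough sketch, not an official 01.AI figure: it counts only model weights (ignoring activations and KV cache) and assumes standard fp16 and 4-bit quantized storage.

```python
# Rough illustration (assumed figures, weights only): why a 34B-parameter
# model becomes feasible on a single GPU once quantized.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

fp16_gb = weight_memory_gb(34, 2.0)   # 16-bit weights
int4_gb = weight_memory_gb(34, 0.5)   # 4-bit quantized weights

print(f"Yi-34B weights in fp16: ~{fp16_gb:.0f} GB")  # ~63 GB: multi-GPU territory
print(f"Yi-34B weights in int4: ~{int4_gb:.0f} GB")  # ~16 GB: fits one 24 GB card
```

By the same arithmetic, a 70B-parameter model needs roughly twice the memory at every precision, which is one reason the 34B size is attractive to deployers.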

Yi is reported to have the longest context window among large models worldwide, at 200K tokens, enough to process roughly 400,000 characters of text. This means Yi-34B can take in a PDF of more than 1,000 pages, and many scenarios that currently rely on vector databases to build external knowledge bases could instead be handled directly within the context window.

In a large language model, the context window is one of the key measures of a model's overall capability, and it is crucial for understanding and generating text tied to a specific context. A longer window means the model can take in richer reference material, and thus generate more coherent and accurate text in domains such as law, finance, and media. OpenAI's GPT-4 has a 32K context window, handling roughly 25,000 words. In May of this year, Anthropic, a well-known Silicon Valley AI startup, expanded Claude's context window to the 100K scale (Claude-100K).
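The page counts above can be sanity-checked with simple arithmetic. The chars-per-token and chars-per-page ratios below are assumptions chosen to match the article's own figures (200K tokens ≈ 400,000 characters ≈ 1,000+ pages), not measured tokenizer statistics.

```python
# Back-of-envelope sketch: how a context window in tokens translates to
# pages of a document. Ratios are illustrative assumptions.

def pages_in_window(window_tokens: int,
                    chars_per_token: float = 2.0,   # assumed ratio
                    chars_per_page: int = 400) -> float:  # assumed page density
    """Estimate how many document pages fit in one context window."""
    return window_tokens * chars_per_token / chars_per_page

print(pages_in_window(200_000))  # Yi (200K):         1000.0 pages
print(pages_in_window(32_000))   # GPT-4 (32K):        160.0 pages
print(pages_in_window(100_000))  # Claude-100K (100K): 500.0 pages
```

Under these assumptions, Yi's 200K window holds roughly six times as much text as GPT-4's 32K window, which is the gap the article is pointing at.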

The team benchmarks itself against top-tier companies such as OpenAI and Google, and has already stockpiled chips for future needs

In late March of this year, Kai-Fu Lee announced that he would personally enter the large-model field with "Project AI 2.0". In July, he formally launched the "AI 2.0" company 01.AI (Zero One Everything).

Kai-Fu Lee stated: "From the first person we recruited, the first line of code we wrote, and the first model we designed, we have held onto the original ambition and determination to become the world's No. 1. We have assembled a team with the potential to benchmark against top-tier companies such as OpenAI and Google. After nearly half a year of accumulation and development, moving at a steady pace and with research and engineering capabilities aligned with global standards, we have delivered our first report card of global competitiveness. Yi-34B has lived up to expectations and made a stunning debut."

According to 01.AI, the company has over 100 employees, more than half of whom are large-language-model experts from multinational corporations and major Chinese technology companies. Its Vice President of Technology was an early member of the team behind Google's chatbot Bard, leading or participating in research and engineering on large models such as BERT and LaMDA for multi-turn dialogue and personal assistants. Its Chief Architect is one of the core founding members of TensorFlow and has collaborated with renowned Google Brain researchers such as Jeff Dean and Samy Bengio.

The key figures behind Yi-34B are Huang Wenhao and Dai Zonghong. Pre-training lead Huang Wenhao comes from the Beijing Academy of Artificial Intelligence (BAAI), where he served as technical director of its Health Computing Research Center. Before joining BAAI, he was a researcher at Microsoft Research Asia, working on natural language understanding, entity extraction, dialogue understanding, and human-computer collaboration. At 01.AI, Huang Wenhao's team is primarily responsible for training Yi. Dai Zonghong, 01.AI's Vice President of AI Infra, was previously a senior algorithm expert in machine intelligence at Alibaba's DAMO Academy and CTO of the artificial intelligence field at Huawei Cloud. During his time at Alibaba, he helped build Alibaba's search engine platform and later led the team that developed the image-search application Pailitao.

As for the underlying computing power that is crucial to large language models, Kai-Fu Lee said he had a stockpiling plan from the very start of the venture.

Last year, the Biden administration barred Nvidia from selling its most advanced artificial-intelligence chips to Chinese customers. Last month, the US tightened those restrictions further, blocking sales of the slightly less capable chips Nvidia had designed specifically for China. In a recent interview with foreign media, Lee called the situation "regrettable", but said 01.AI has already stockpiled the chips it will need. Earlier this year, the startup borrowed money from venture-capital firm Sinovation Ventures to make large-scale chip purchases. "We basically bet everything, even beyond our original account balance," Lee said. "We felt we had to do it."

01.AI has already mapped out a business strategy beyond the newly released open-source models. The startup will work with customers to develop proprietary, customized models for specific industries. The name Yi-34B refers to the 34 billion parameters used in training, but the company is already developing a model with over 100 billion parameters. "Our proprietary model will be benchmarked against GPT-4 (the large language model developed by OpenAI)," Lee said.

According to Lee, 01.AI's next step is to build a consumer-facing super application (SuperApp) on top of the Yi model series. "We will share the prototype of a SuperApp with everyone in the near future. In the AI 2.0 era, the biggest business opportunity will definitely be a super application, and it is very likely to be a consumer-grade one, targeting ToC super applications both at home and abroad."

"AI2.0 is the largest technological revolution in history, and the biggest opportunity it brings to change the world must be the platform and technology. Just like Microsoft Office in the PC era, WeChat, Tiktok and Meituan in the mobile Internet era, ToC applications must have the highest probability of explosive growth in commercialization." Li Kaifu stressed that in the AI2.0 era, it is very important to make revenue and to continue to make high-quality revenue, The following apps and future SuperApps of Zero One should be promoted and developed based on this principle.

