intellectual curiosity

DeepSeek V3 low-key update: big programming boost, R2 model coming too?

With no launch event and no publicity blitz, DeepSeek quietly released version V3-0324 on March 24. https://huggingface.co/deepseek-ai/DeepSeek-V3-032...

NVIDIA's latest chip roadmap: Rubin GPU, Rubin Ultra, and the next-generation GPU Feynman to launch over the next three years

Blackwell has not yet shipped at scale, and NVIDIA has already laid out two generations of successors. On Tuesday, March 18 local time, NVIDIA CEO Jensen Huang delivered a keynote speech at the GTC 2025 conference, announcing the 2026-2027 data center GPU roadmap, Rubin and Ru...

Free ahead of schedule! Baidu's Wenxin Large Model 4.5 and Wenxin Large Model X1 are now online!

On March 16, Baidu officially released Wenxin Large Model 4.5 and Wenxin Large Model X1, both of which can be used free of charge on the official Wenxin Yiyan website. According to reports, Wenxin Large Model 4.5 is Baidu's first native multimodal large model, and its multimodal comprehension, text and logical reasoning capabilities have been significantly improved in a number of measurement...

AI enters the era of reasoning models: understand chain of thought in one article

Recently, the reasoning model DeepSeek-R1 has arguably been the number one topic in AI. Those who have used it know that the model outputs a chain of thought before giving its final answer, which improves the accuracy of that answer. Today's post will take you through the chain of thought...

Google open-sources the Gemma 3 multimodal large models, with 128K input support and free commercial use

Gemma is a family of lightweight large models open-sourced by Google. Just now (March 12, 2025), Google open-sourced the third generation of the Gemma family, which comes in four different parameter sizes. The Gemma 3 series is multi...

ZhiDing | "DeepSeek+Government" Nationwide Deployment Map: Multiple Local Governments Accelerate the AI+Government Rollout

Applying large models to government affairs has become an important lever for governments to improve service quality, and DeepSeek is being deployed and applied in government affairs across China at an unprecedented pace. The DeepSeek series of models, by virtue of its advantages in terms of cost and performance, in the field of government services, public...

Manus: Chinese team releases the world's first general-purpose AI agent, setting the tech scene abuzz

In the early hours of March 6, the tech world had another sleepless night after DeepSeek: everyone's feeds were flooded by a product called Manus. The AI community was buzzing, and AI agent stocks soared. The release of Manus once again put Chinese AI technology in the global spotlight. According to its team...

Understand the key parameters of large models in one article: tokens, context length, and output limits

With the rapid development of artificial intelligence technology, large language models (LLMs) have become a key force driving this field forward. To better master and use LLM technology, it is particularly important to understand its core parameters. In this article, we will take an in-depth look at three key parameters of large language models: the Toke...

Learn about DeepSeek's private deployment costs in one article: how do organizations choose?

In today's era of rapid AI development, DeepSeek, as a leading AI model, has become the first choice of many enterprises thanks to its powerful capabilities and wide range of applications. On the one hand, R1, V3 and other versions of the model, billed as offering "performance comparable to GPT-4 at only 10% of the cost", have pushed ...

DeepSeek "Open Source Week" drops five releases in a row: the power of software reshapes the AI computing power landscape

"OpenAI is not Open, DeepSeek is Deep". This week, with "Open Source Week" in full swing, DeepSeek unveiled new "black tech" every day, prompting programmers around the world to exclaim that DeepSeek is playing on another level! From computing to communication to storage, De...