Huawei’s CloudMatrix 384 AI platform has reached what the company claims is an important milestone, with internal testing showing its new data centre architecture outperforming Nvidia’s H800 GPUs while running DeepSeek’s advanced R1 artificial intelligence model.
Research conducted by Huawei in cooperation with Chinese AI infrastructure startup SiliconFlow provides what appears to be the first detailed public performance data for CloudMatrix 384.
However, it is important to note that the benchmarks were produced by Huawei itself, which raises questions about independent verification of the claimed performance advantages over established industry standards.
The paper describes CloudMatrix 384 as a “next-generation AI datacentre architecture that embodies Huawei’s vision for reshaping the foundation of AI infrastructure”. While the technical achievements outlined appear impressive, the absence of third-party validation means the results should be viewed in the context of Huawei’s continuing efforts to demonstrate technological competitiveness despite US sanctions.
CloudMatrix 384 architecture
CloudMatrix 384 integrates 384 Ascend 910C NPUs and 192 Kunpeng CPUs in a supernode, connected through an ultra-high-bandwidth, low-latency unified bus (UB).
Unlike traditional hierarchical designs, the peer-to-peer architecture enables what Huawei calls “direct all-to-all communication”, allowing compute, memory, and network resources to be pooled dynamically and scaled independently.
The system’s design addresses notable challenges in modern AI infrastructure, particularly for mixture-of-experts (MoE) architectures and the distributed key-value cache access considered essential for serving large language models.
Performance claims: the numbers in context
Huawei’s CloudMatrix AI results, although internally conducted, represent impressive metrics of the system’s capabilities. To understand the numbers, it helps to think of AI processing as a conversation: the “prefill” phase is when the AI reads and “understands” a question, while the “decode” phase is when it generates its response, word by word.
According to the company’s testing, CloudMatrix-Infer achieves 6,688 tokens per second per NPU when reading prompts and 1,943 tokens per second per NPU when generating responses.
Think of tokens as individual pieces of text, roughly equivalent to words or parts of words, that the AI processes. For context, this means the system can handle thousands of words per second on each chip.
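The two phases described above can be sketched in miniature. The snippet below is a purely illustrative toy (the function names, fake tokens, and cache representation are our own, not Huawei’s stack); it shows why prefill can be fast and parallel while decode is inherently one-token-at-a-time.

```python
# Toy sketch of the two LLM inference phases: illustrative only,
# not Huawei's or DeepSeek's actual implementation.

def prefill(prompt_tokens):
    """Prefill: the model reads the whole prompt in one parallel pass."""
    # In a real system this is one large batched matrix multiplication
    # over every prompt token at once, which is why prefill throughput
    # (tokens/s) is far higher than decode throughput.
    kv_cache = [f"state({t})" for t in prompt_tokens]
    return kv_cache

def decode(kv_cache, max_new_tokens=5):
    """Decode: the response is generated one token at a time."""
    output = []
    for step in range(max_new_tokens):
        # Each step depends on all previous tokens, so decode is sequential.
        next_token = f"tok{step}"
        output.append(next_token)
        kv_cache.append(f"state({next_token})")
    return output

cache = prefill(["Why", "is", "the", "sky", "blue", "?"])
print(decode(cache))  # ['tok0', 'tok1', 'tok2', 'tok3', 'tok4']
```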
The “TPOT” (time-per-output-token) measurement of under 50 milliseconds means the system generates each word of its response in less than a twentieth of a second, delivering remarkably fast response times.
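A quick back-of-envelope calculation (our arithmetic, not Huawei’s published methodology) shows how the sub-50 ms TPOT figure relates to the reported per-chip decode throughput: a single conversation sees at most about 20 tokens per second, so reaching 1,943 tokens per second per chip implies roughly a hundred conversations being served concurrently through batching.

```python
# Back-of-envelope arithmetic on the reported figures (our reconstruction,
# not a calculation published by Huawei).

tpot_s = 0.050                  # reported time-per-output-token: under 50 ms
per_stream_rate = 1 / tpot_s    # tokens/s a single conversation sees
print(per_stream_rate)          # 20.0 tokens/s per user stream

decode_rate_per_chip = 1943     # reported decode throughput per chip
# To sustain 1,943 tokens/s per chip at 50 ms per token, the chip must be
# serving many conversations at once (batching):
implied_concurrency = decode_rate_per_chip * tpot_s
print(round(implied_concurrency))  # ~97 concurrent streams per chip
```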
Perhaps more importantly, Huawei’s results point to what the company claims is superior efficiency compared with competing systems. The company measures this through “compute efficiency”: essentially, how much useful work each chip extracts from its theoretical maximum processing power.
Huawei claims its system reaches 4.45 tokens per second per TFLOPS when reading prompts and 1.29 tokens per second per TFLOPS when generating answers. For perspective, TFLOPS (trillions of floating-point operations per second) is a measure of raw computing power used to rate hardware performance.
Huawei’s efficiency claims suggest its system extracts more useful AI work from each unit of computational hardware than Nvidia’s H100 and H800 processors.
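As a sanity check on what the efficiency metric means, it can be inverted: dividing the reported prefill throughput by the reported efficiency recovers the peak compute the calculation implicitly assumes per chip. The arithmetic below is our own back-of-envelope reconstruction, and the resulting TFLOPS figure is an implied value, not a published hardware specification.

```python
# Compute efficiency is just throughput divided by peak compute:
#   efficiency (tokens/s/TFLOPS) = throughput (tokens/s) / peak (TFLOPS)
# Rearranging recovers the peak compute the reported numbers imply.

prefill_tokens_per_s = 6688   # reported per-chip prefill throughput
prefill_efficiency = 4.45     # reported tokens/s per TFLOPS (prefill)

implied_peak_tflops = prefill_tokens_per_s / prefill_efficiency
print(round(implied_peak_tflops))  # ~1503 TFLOPS implied per chip
```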
The company also reports sustaining 538 tokens per second per NPU under a stricter sub-15-millisecond per-token latency requirement.
However, these impressive figures lack independent third-party verification, a standard practice for validating performance claims in the technology industry.
Technical innovations behind the claims
The reported CloudMatrix AI metrics stem from several technical approaches detailed in the research paper. The system implements what Huawei describes as a “peer-to-peer serving architecture” that disaggregates the inference workflow into three subsystems: prefill, decode, and caching, allowing each component to scale independently according to workload.
The paper highlights three innovations: the peer-to-peer serving architecture with disaggregated resource pools; large-scale expert parallelism supporting an EP320 configuration, in which each NPU die hosts a single expert; and hardware-aware optimisations including tuned operators, microbatch pipelining, and INT8 quantization.
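The EP320 expert-parallel layout can be sketched in miniature: a router assigns each token to one of 320 experts, each of which lives on its own NPU die, so expert traffic spreads across the machine. The code below is an illustrative toy under our own naming; real MoE systems use a learned gating network rather than the hash stand-in here.

```python
# Minimal sketch of large-scale expert parallelism (EP320): one expert per
# NPU die, each token routed to the die holding its assigned expert.
# Illustrative only; names, sizes, and the hash-based router are our own.

NUM_EXPERTS = 320  # EP320: one expert per NPU die

def route_token(token_id, top_k=1):
    """Pick which expert(s), and therefore which die, handle a token.
    Real MoE routers use a learned gating network; a hash stands in here."""
    return [(token_id * 2654435761 + i) % NUM_EXPERTS for i in range(top_k)]

# Every token lands on some die in [0, 320); traffic spreads across dies.
assignments = [route_token(t)[0] for t in range(10)]
print(assignments)
assert all(0 <= e < NUM_EXPERTS for e in assignments)
```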
Geopolitical context and strategic implications
The performance claims come against a backdrop of intensifying US-China technology tensions. Huawei founder Ren Zhengfei recently acknowledged that the company’s chips still lag “a generation” behind American competitors, but argued that clustering methods can achieve performance comparable to the world’s most advanced systems.
Nvidia CEO Jensen Huang appeared to corroborate this view in a recent CNBC interview, saying: “AI is a parallel problem, so if each one of the computers is not capable… just add more computers… in China, (where) they have plenty of energy, they’ll just use more chips.”
Lead researcher Zuo Pengfei, part of Huawei’s “Genius Youth” programme, framed the research’s strategic importance, writing that the paper aims “to build confidence within the domestic technology ecosystem in using Chinese-developed NPUs to outperform Nvidia’s GPUs”.
Verification questions and industry impact
Beyond the headline metrics, Huawei states that the quantized model maintained accuracy comparable to the official DeepSeek-R1 API across 16 benchmarks, again in internal, unverified tests.
The AI and technology industries will likely await independent verification of Huawei’s CloudMatrix AI claims before drawing definitive conclusions.
Nevertheless, the technical approaches described point to genuine innovation in AI infrastructure design, offering insights for the industry regardless of the specific performance numbers.
Huawei’s claims, whether they are eventually verified or not, underline the intensity of competition in AI hardware and the variety of approaches companies are taking to computational efficiency.