UK AI chip unicorn Olix: founded by a post-2000 entrepreneur, about RMB 1.5 billion raised, less than 2 years old
According to foreign media reports today, Olix, a secretive British AI chip startup founded by a post-2000 entrepreneur, has raised US$220 million (about RMB 1.5 billion) at a valuation of over US$1 billion (about RMB 6.9 billion), joining the ranks of unicorn companies.
Founded in March 2024 and headquartered in London, UK, Olix (previously known as Flux Computing) was started by James Dacombe and plans to develop AI chips that are faster and cheaper than NVIDIA GPUs.
James Dacombe, 25, is also the founder and CEO of UK-based brain-monitoring startup CoMind, which he founded at 18 and which has raised $100 million (about RMB 700 million).

▲James Dacombe
Targeting AI inference requirements, Olix is building a new type of AI chip that aims at high throughput and high interactivity for the most demanding inference workloads, unconstrained by the architectural and supply-chain limitations of today's AI chips.
Olix's Optical Tensor Processing Unit (OTPU) is an optical-digital processor with a new memory and interconnect architecture.
Its team believes that combining an SRAM architecture with photonics can outperform HBM-based architectures in throughput per megawatt and total cost of ownership, and significantly outperform pure-silicon SRAM architectures in interactivity and latency.
The company has raised a cumulative $250 million (roughly RMB 1.7 billion) in funding. According to people familiar with the matter, Olix hopes to make first deliveries to customers as early as next year. The startup declined to comment on its financing.
Jonathan Heiliger, general partner at Vertex Ventures and former Facebook infrastructure executive, argues that AI inference requires a complete rethinking of how chips are made, and that large-scale reconfigurations of system-level architectures are extremely difficult to pull off: "James and his team are executing faster than companies with ten times the resources."
The scale of funding for UK chip companies currently lags far behind the US. Fractile, another British AI chip startup, announced yesterday that it plans to invest £100 million (about RMB 900 million) over the next three years to expand its presence on British soil.
Olix has shared its chip design ideas on its official website:
Existing GPU architectures are approaching their physical limits: current hardware fundamentally cannot provide fast inference for every user at the same time.
This trade-off is inherent in the memory architecture used by all mainstream accelerators since TPUv2 and V100: a large logic die placed on an interposer next to stacked HBM memory.
High throughput per XPU and per megawatt can only be achieved by batching requests from many users, fully utilizing the compute resources and amortizing the energy cost of streaming model weights over HBM across a large number of output tokens.
But large-batch processing inevitably increases per-user latency and reduces interactivity, forcing users into a hard tradeoff.
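The batching tradeoff described above can be sketched with a simple roofline model. All numbers below are illustrative assumptions, not Olix's or NVIDIA's figures: per decode step, the model weights are streamed from HBM once regardless of batch size, while compute grows with the batch, so throughput improves with batching until the compute term dominates, after which every extra request adds to each user's per-token latency.

```python
# Illustrative roofline sketch of the throughput-vs-interactivity tradeoff.
# All constants are assumptions chosen for illustration only.

WEIGHT_BYTES = 140e9          # assumed: ~70B-parameter model at FP16
HBM_BW       = 3.0e12         # assumed HBM bandwidth, bytes/s
FLOPS_TOKEN  = 2 * 70e9       # ~2 FLOPs per parameter per generated token
PEAK_FLOPS   = 1.0e15         # assumed accelerator peak compute, FLOPs/s

def step(batch):
    """Return (per-token latency, aggregate tokens/s) for one decode step."""
    mem_time     = WEIGHT_BYTES / HBM_BW             # fixed: weights streamed once per step
    compute_time = batch * FLOPS_TOKEN / PEAK_FLOPS  # grows with the batch
    t = max(mem_time, compute_time)                  # roofline: slower of the two wins
    return t, batch / t

for b in (1, 32, 512, 4096):
    latency, throughput = step(b)
    print(f"batch={b:5d}  per-token latency={latency*1e3:7.2f} ms  "
          f"throughput={throughput:10.0f} tok/s")
```

At small batches the step is memory-bound, so latency is flat and throughput scales with batch size; past the crossover, throughput saturates while per-user latency grows linearly, which is exactly the tradeoff the paragraph describes.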
Inference performance is limited by data transfer. As a result, continued improvements in logic efficiency (FLOPs/W) and throughput density (FLOPs per package) yield diminishing returns. Reductions in data-transfer time are constrained by the memory wall as well as by package interconnect edge length and package-size limits.
While the transition from HBM2 to HBM4 has yielded significant gains in both energy efficiency and throughput density, achieving such a dramatic improvement again would take another decade and require more complex, more expensive manufacturing techniques.
HBM performance enhancements bring only limited energy-efficiency gains: the pJ/bit energy required to transfer the KV cache for each token sets a lower bound on per-token energy consumption, and thus on the total energy consumption of the current architecture.
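The energy floor described above is easy to see with back-of-envelope arithmetic. The numbers below are assumptions for illustration (neither Olix's nor any vendor's published figures): if moving a bit over HBM costs a few picojoules, the KV-cache traffic per token alone fixes a minimum energy per generated token that no amount of logic efficiency can remove.

```python
# Back-of-envelope sketch of the per-token energy floor from KV-cache transfer.
# Both constants are illustrative assumptions.

PJ_PER_BIT   = 3.0      # assumed HBM transfer energy, pJ per bit
KV_BYTES_TOK = 50e6     # assumed KV-cache bytes read per token at long context

energy_pj = KV_BYTES_TOK * 8 * PJ_PER_BIT   # pJ per token, transfer only
print(f"KV-cache transfer floor: {energy_pj / 1e9:.2f} mJ per token")
# Halving pJ/bit (roughly an HBM-generation leap) only halves this floor;
# it cannot eliminate it, which is the bound the text refers to.
```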
Over the past decade, this architectural scaling has improved overall system performance, but further scaling cannot deliver high throughput and high interactivity at the same time. From NVIDIA Hopper to Rubin Ultra, package size has grown roughly fourfold; another 4x would approach the limits of wafer-level packaging.
Larger packages can shorten data-transfer times and improve interactivity, but they cannot reduce fixed data-transfer latency. Amdahl's law therefore limits how much interactivity can be gained in the future by further increasing package size.
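The Amdahl's-law argument above can be made concrete. The fixed fraction used here is an assumption for illustration: if some share of per-token latency is irreducible transfer latency, then even an infinitely larger package, which only speeds up the remaining share, yields a bounded interactivity gain.

```python
# Amdahl's-law sketch of the bounded-interactivity claim.
# The 25% fixed fraction is an illustrative assumption.

def amdahl_speedup(fixed_fraction, scalable_speedup):
    """Overall speedup when only (1 - fixed_fraction) of the time is sped up."""
    return 1 / (fixed_fraction + (1 - fixed_fraction) / scalable_speedup)

FIXED = 0.25                       # assumed share of per-token latency that is fixed
for s in (2, 4, 16, 1e9):          # speedup applied to the scalable portion
    print(f"scalable part {s:>12.0f}x faster -> token latency only "
          f"{amdahl_speedup(FIXED, s):.2f}x better")
# The limit is 1 / FIXED = 4x, no matter how large the package grows.
```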
The physical path of data from HBM through the interposer to the compute units has not fundamentally changed, but it has grown more complex with the introduction of high-bandwidth cross-reticle interfaces.
Data-transfer latency, measured per cache hit or miss, is therefore near or at its limit, and is becoming an increasingly large component of per-token latency.
While tensor parallelism at larger scale can further reduce per-layer data-transfer time, it increases power consumption and interconnect latency.
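The tensor-parallelism tradeoff above can be sketched with a toy cost model. All constants, and the log-depth all-reduce term, are illustrative assumptions: splitting a layer across `tp` devices divides the weight traffic per device, but adds a collective whose latency grows with `tp`, so per-layer time eventually turns back up.

```python
# Toy model of the tensor-parallelism tradeoff; all numbers are assumptions.
import math

LAYER_BYTES = 1.6e9      # assumed weight bytes per layer
BW          = 3.0e12     # assumed memory bandwidth per device, bytes/s
HOP_LATENCY = 2e-6       # assumed interconnect latency per all-reduce stage, s

def layer_time(tp):
    transfer  = LAYER_BYTES / (tp * BW)                          # shrinks with tp
    allreduce = HOP_LATENCY * math.log2(tp) if tp > 1 else 0.0   # grows with tp
    return transfer + allreduce

for tp in (1, 8, 64, 1024, 4096):
    print(f"tp={tp:5d}  per-layer time = {layer_time(tp)*1e6:8.2f} µs")
```

In this sketch, per-layer time falls sharply at first, then flattens and rises once the interconnect term dominates, which is why more tensor parallelism cannot reduce latency indefinitely.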
In addition, high-throughput encoding schemes introduce encoding and decoding delays, further raising the minimum per-token latency and limiting achievable interactivity.
If this tradeoff could be resolved through scale, integration, or execution, the core companies of today's computing ecosystem would be the ones to do it. With billions of dollars committed upfront to secure access to leading logic nodes, HBM, and advanced packaging capacity, such companies enjoy a huge moat in software, systems integration, and supply chain.
Each generation doubles down on this approach. Systems are getting larger, more integrated, and more ambitious. Absolute performance continues to improve, but the underlying limitation remains the same: high interactivity and high throughput still cannot be achieved simultaneously.
Hardware that delivers both high throughput and high interactivity must address both large-scale data-transfer efficiency and latency. Any approach that improves only one dimension merely changes the shape of the tradeoff.
The Olix team believes that, from a supply-chain and manufacturing perspective, a new architecture must forgo HBM (high-bandwidth memory), advanced packaging, and any other technology bottlenecked by incumbent vendors' supply chains. Even the largest hyperscale data center operators struggle to secure capacity, and startups simply can't compete.
From a compatibility perspective, the hardware must support existing models. It should not demand quantum computing or unproven physics, nor require new thermodynamic or neuromorphic architectures, even if such architectures promise theoretical improvements.
From a design perspective, achieving this goal requires system-level thinking: shifting from reticle- and wafer-level design to co-design of rack-level compute and data transfer as a single unified system.
There is no shortage of well-funded challengers in this space, but they all fall into the same two failure modes.
Some chips still follow the logic-die/interposer/HBM architectural paradigm, often using older, lower-end HBM and logic processes, and face the same interactivity-throughput tradeoff when competing with the latest generation of GPUs and TPUs.
Others have not gone far enough. Recognizing the need for a new paradigm, they have attempted to reshape the interactivity tradeoff but have been unable to escape it, remaining constrained by the limits of a silicon-only approach.
The Olix team wants to break free from these limitations and create the next paradigm in cutting-edge AI.
Article source: Core Stuff