46 Comments

  • m53 - Saturday, March 4, 2023 - link

    Actually this makes sense. Current Ponte Vecchio customers are unlikely to replace those systems with Rialto next year for only a 10-15% incremental performance improvement. Hopefully Falcon Shores will bring substantial improvements with the 18A process and a new design. Reply
  • Kevin G - Sunday, March 5, 2023 - link

    Intel is still in the phase of gaining market share on the GPU side here. It'd effectively be a performance increase for new customers looking to remain competitive with the likes of AMD and NVIDIA: get Ponte Vecchio today for development/test and then deploy on Rialto Bridge. This leaves a huge hole in their lineup. Reply
  • whatthe123 - Sunday, March 5, 2023 - link

    I get the feeling that the design is just not realistic for Intel's production capacity. Intel has been struggling for almost a decade now, and Ponte/Rialto are insanely complicated due to how many chiplets are used. At some point they have to just admit that either the design is too complicated or their packaging facilities are not capable of handling it while also having to deliver other heavily interconnected chips like Sapphire Rapids and Meteor Lake. Reply
  • mode_13h - Monday, March 6, 2023 - link

    That doesn't make sense to me. If there are too many chiplets for it to be cost-effective or reliable, then wouldn't they also pull back from selling Ponte Vecchio? And wouldn't Rialto be a chance for them to try and make some design simplifications & manufacturability compromises, even if it doesn't deliver on its original performance goals? Reply
  • Kevin G - Monday, March 6, 2023 - link

    With so many tiles in play, you'd think that Intel could rapid-fire a product with only some of them updated to address performance/power as necessary. That is one of the key reasons for chiplets of this complexity: a delay in one chiplet doesn't inherently have to delay the entire product (but be prepared for compromises!). Reply
  • fallaha56 - Monday, March 6, 2023 - link

    Kevin is shilling lol

    You are spot on; Ponte V is a manufacturing disaster with ~10% yields, and it's being sold to save pride
    Reply
  • whatthe123 - Monday, March 6, 2023 - link

    Aren't they pulling back on Ponte? I mean, it was meant to be a direct competitor against last-gen HPC GPUs, but as of right now it seems to be exclusive to Aurora, and even then the system is still not fully operational. Rialto was meant to reduce the chiplet count, but its sudden drop from their roadmap suggests to me that even with the reduced complexity it just isn't realistic given Intel's current situation. Reply
  • Kevin G - Monday, March 6, 2023 - link

    Intel is able to get it to work, which by itself is impressive. The flaws of Ponte Vecchio seem to stem more from the architecture and raw performance than from its manufacturing/packaging side. This isn't Intel's first product putting multiple high-power dies next to each other in a package using EMIB and similar technologies. Reply
  • fallaha56 - Monday, March 6, 2023 - link

    Sorry Kevin but no, yields on the final Ponte V package are ~10%

    Never mind the performance issues
    Reply
  • whatthe123 - Monday, March 6, 2023 - link

    On paper the performance isn't too bad. Not great for the complexity, but it would have been pretty good if it had hit its original timeline, and it's generally still considered performant even now.

    I don't think Intel has put this many chiplets together on any high-volume product, though. Connecting 4 chiplets is much less risky compared to the 47 in Ponte, since each connection is a potential point of failure that can ruin the entire GPU.
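
    A toy independence model makes that risk concrete. This is a minimal sketch with illustrative numbers I picked (p = 0.99 per connection is a guess, not an Intel figure), assuming each connection succeeds independently:

        # Minimal sketch: if each chiplet connection succeeds independently
        # with probability p, the whole package only assembles with p**n.
        p = 0.99  # illustrative per-connection success rate, not a published figure
        for n in (4, 47):  # a 4-chiplet design vs. Ponte Vecchio's 47 tiles
            print(f"{n} connections -> assembly yield ~{p**n:.0%}")
        # -> roughly 96% for 4 connections vs. roughly 62% for 47,
        #    before any per-die defects are even counted.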
    Reply
  • mode_13h - Sunday, March 5, 2023 - link

    > Current Ponti Vecchio customers are unlikely to replace these systems with Rialto next year

    I'm sure that was never the expectation. The point of Rialto Bridge would be to try and keep pace with AMD and Nvidia, who are launching products newer than what Ponte Vecchio was designed to compete with.
    Reply
  • KurtL - Monday, March 6, 2023 - link

    I agree with that. HPC centers are not gamers; equipment is used for 5-6 years at the moment, as the evolution of semiconductor technology has slowed down enough that chips remain sufficiently performance-per-Watt competitive that long. My impression was always that Rialto Bridge was an effort to make Ponte Vecchio more manufacturable and hence a bit more affordable.

    What the market probably needs most at this time though is a low-cost chip from both AMD and Intel that still supports their respective software stacks well and can be used as a development platform, the way CUDA became popular because every school kid could learn to use it on their PC.
    Reply
  • mode_13h - Monday, March 6, 2023 - link

    I wonder to what extent Arctic Sound (or its consumer-grade Arc cousin) can fill in as a suitable proxy for oneAPI developers. Reply
  • JayNor - Saturday, March 4, 2023 - link

    I recall discussions about HPC users moving away from heterogeneous nodes and towards disaggregated racks of GPUs.

    The IPU/DPU chips, the DSA on the SPR CPUs, and the RoCE DMA controllers on Intel's Habana chips can all function to push data to the GPUs. Intel's Ponte Vecchio has prefetch streaming instructions for L2.

    I'm not saying that GPUs don't need something to feed them, just that it doesn't have to be the job of a server class CPU.
    Reply
  • Yojimbo - Sunday, March 5, 2023 - link

    I think it depends on the workload. With some workloads it is necessary for the GPUs to act on large amounts of data. Then it must be copied in and out of main memory, which is slow and consumes a lot of energy. Also some workloads benefit from CPUs and GPUs acting on the data, i.e., in a truly heterogeneous way. Then having the processors both close to the memory again is faster and saves energy.
    It's in hyperscale settings where disaggregation is likely preferred. In such settings you have a large amount of compute resources, but they are being shared by many relatively lightweight tasks. Finally, with supercomputers, current DPUs would likely be swamped, and a lot of the code run on them needs powerful CPUs anyway.
    Reply
  • Kevin G - Sunday, March 5, 2023 - link

    This is where a good chiplet strategy comes into play. Need fast memory? Add an HBM memory controller and memory tiles. Need capacity? Add DDR5 memory controller tiles. As long as the die-to-die interconnects are good, this provides incredible flexibility in end products. Reply
  • mode_13h - Sunday, March 5, 2023 - link

    Use CXL for external memory, not DDR5. CXL can be used either for memory capacity or interconnect bandwidth, plus it's cache-coherent (unlike PCIe). Reply
  • The Von Matrices - Sunday, March 5, 2023 - link

    CXL memory nodes are just DDR memory controllers with the additional latency of being a remote node. Local memory controllers will always be faster. Reply
  • mode_13h - Monday, March 6, 2023 - link

    Right. You don't use CXL (or DDR5) for bandwidth on these things. HBM serves that purpose much better, to the tune of 3.3 TB/s in all of PVC, MI250X, and H100!

    That's a completely impractical target to achieve using DDR5, so external DRAM is simply relegated to the role of a slower memory tier for capacity scaling. To that end, you might as well just use CXL, because then you can reap its other benefits.
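
    A quick back-of-the-envelope sketch of how lopsided that comparison is (my own assumed numbers using DDR5-4800 theoretical peak rates, not vendor figures):

        # Minimal sketch: DDR5-4800 channels needed to match the ~3.3 TB/s of
        # HBM bandwidth cited above (theoretical peaks, illustrative only).
        hbm_gb_s = 3300              # ~3.3 TB/s aggregate HBM bandwidth
        ddr5_channel_gb_s = 4.8 * 8  # 4800 MT/s x 8 bytes = ~38.4 GB/s per channel
        print(f"DDR5-4800 channels needed: ~{hbm_gb_s / ddr5_channel_gb_s:.0f}")
        # -> ~86 channels, which is why external DRAM ends up as a capacity
        #    tier rather than a bandwidth tier.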
    Reply
  • Kevin G - Monday, March 6, 2023 - link

    The reason to favor native DDR5 over CXL memory expanders is the raw capacity you can get from going massively wide with 8-12 channels in addition to HBM. HBM simply doesn't have the raw native capacity for many workloads, and thus nodes will have to be supplemented. Any workloads that need even more would be well suited to CXL with a 3-tier memory setup (HBM > DDR5 > CXL). Reply
  • KurtL - Monday, March 6, 2023 - link

    That was the dream of the Gen-Z group, and maybe now of some CXL sites. The reality is: this will only work for very, very few applications. The real problem with GPUs is the latency involved in transferring data between the CPU and GPU, and also the latency in the software stack when passing control between CPU execution and GPU execution. This is why NVIDIA and AMD are both moving the CPU and GPU closer together, each in their own way, and that was also what Intel was doing with the Falcon Shores XPU.

    There is another model of disaggregation, though, that is already being used today in CPUs and that has been explored in various European projects (I'm less familiar with their USA counterparts). It is what is sometimes called the cluster-booster architecture, where you have regular compute nodes and GPU compute nodes that talk to each other using the mechanisms already in place for distributed-memory computing. In these designs the GPU nodes still have some fast scalar cores on the GPU node. You can already see designs that combine different node types that way (not only GPU and CPU) in many EuroHPC supercomputers, and LUMI, the first European pre-exascale machine, is an example of a supercomputer combining GPU nodes that have somewhat smaller-than-usual scalar cores with regular CPU nodes. The AMD MI300 will also be an ideal chip for such a design.
    Reply
  • JayNor - Saturday, March 4, 2023 - link

    I've lost track of what else is riding on Falcon Shores. Intel has PowerVia and GAA transistors on the roadmap somewhere, but the GPU compute tile has so far been built on TSMC N5. There were some rumors about in-package optics on Falcon Shores. There were slides mentioning extreme bandwidth memory. Intel has CXL 2.0 on a recently announced FPGA, so you would expect that to show up on some GPUs. Reply
  • ballsystemlord - Saturday, March 4, 2023 - link

    So, Intel's plans to make an XPU are going to be like their plans for going from 14nm to 10nm. Reply
  • fallaha56 - Monday, March 6, 2023 - link

    Yep, there is a reason the stock is at $25 and the company's market cap is below AMD's, never mind 5x below Nvidia's. The future is XPU. Reply
  • Thunder 57 - Saturday, March 4, 2023 - link

    seceding should be ceding. Reply
  • Ryan Smith - Sunday, March 5, 2023 - link

    Thanks! Reply
  • thestryker - Saturday, March 4, 2023 - link

    The cancelation of Rialto Bridge makes sense to me, as the Ponte Vecchio delays meant the two would be too close together for that market. Not having another part until 2025 is worrying, unless it's Q1 for general availability, which would make it about 2 years out from Ponte Vecchio.

    As for the future of combining CPU/GPU, I can't help but wonder whether the relative performance wasn't there or their customers just weren't interested yet.

    Hopefully these moves are more a sign of the reorganization of the graphics division than of something truly rotten.
    Reply
  • lmcd - Tuesday, March 7, 2023 - link

    The rotten part is poor architecture and roadmapping from Krzanich. It is unlikely Gelsinger allows this to happen again, but I'm not shocked this product is looking doomed. Reply
  • brucethemoose - Sunday, March 5, 2023 - link

    "Intel is going to devote itself to building a product out of just GPU tiles, reducing what was to be intel’s XPU into a more straightforward GPU."

    Now that bit is concerning. The CPU+GPU seems like a compelling combination... maybe Intel thinks a pure GPU would be more competitive than what Nvidia/AMD are cooking, even with the delay.

    Or maybe it's a market adjustment for generative AI... but that's more of Habana's field?
    Reply
  • Kevin G - Sunday, March 5, 2023 - link

    That is concerning, but it could be a reflexive reaction to the complexities of Ponte Vecchio. Some simplification was needed, or further advancements in packaging. Ponte Vecchio is late and expensive and arrived on the market just as competitors were refreshing their lineups. Horrible timing all around. Reply
  • nunya112 - Sunday, March 5, 2023 - link

    Raja Koduri strikes again. He sucked at AMD; he sucks now. Reply
  • III-V - Sunday, March 5, 2023 - link

    Seems infinitely more likely that this is due to market conditions and Intel's financial situation than to who's at the head of the GPU division. Reply
  • mode_13h - Sunday, March 5, 2023 - link

    Well, Vega missed its market window and so did Arc Alchemist & Ponte Vecchio. So, that much is consistent between his time at AMD and now at Intel. Reply
  • WaltC - Sunday, March 5, 2023 - link

    My thought about Raja is that he left AMD when he had no more to contribute, as AMD GPU designs were moving beyond his skill set. Intel for the past few years has been great at launching vaporware but not a lot else, interestingly enough. Competing with AMD means competing with a regularly improving product line, and this is something Intel has yet to learn how to do. The entire company has to be reorganized to get competitive in a real sense, as the old Tick-Tock cadence won't work anymore. But Gelsinger doesn't know much else. So that's a problem for them. Management can make or break a company. Reply
  • mode_13h - Sunday, March 5, 2023 - link

    > Raja .. left AMD when he had no more to contribute

    He left AMD when Vega was late and uncompetitive. If not for the crypto mining boom, Vega would've been a disaster for AMD. I think his departure was "a mutual decision", as they say.

    > as AMD GPU designs were moving beyond his skill set.

    Then what the heck is he doing at Intel? What about Ponte Vecchio tells you it's within his skill set when CDNA and RDNA weren't??

    The rest of your take seems even weirder. I think Intel knows how to design CPUs, but has struggled with unprecedented manufacturing problems. Alder Lake and Raptor Lake have been very impressive, considering TSMC still has better nodes. I can only read into the rumored Meteor Lake desktop cancellation that Intel 4 is having issues not unlike their early 10 nm nodes.

    Personally, I'm willing to give Gelsinger another year or two, before I pass judgment. The lead time on chip designs and manufacturing process is so long that what's so far reaching the market was already in development when he took the helm.
    Reply
  • nandnandnand - Sunday, March 5, 2023 - link

    "My thought about Raja is that he left AMD when he had no more to contribute, as AMD GPU designs were moving beyond his skill set."

    That sounds like fanboy fiction. However, taking a dump on Raja and Intel is a time-honored tradition, so please continue.
    Reply
  • fallaha56 - Monday, March 6, 2023 - link

    Ok Mr 300W Raptor Lake being out gamed at 100W

    Nice Intel GPU too lol
    Reply
  • nandnandnand - Tuesday, March 7, 2023 - link

    You're cringe, bro Reply
  • mode_13h - Sunday, March 5, 2023 - link

    What do we know about Lancaster Sound? Given that Arctic Sound was made by repurposing the Arc Alchemist die, am I correct to assume Lancaster was set to do the same with Battlemage? If so, does it tell us anything about Battlemage? Maybe Intel is streamlining its design to better optimize it for gaming. Reply
  • Ryan Smith - Tuesday, March 7, 2023 - link

    "What do we know about Lancaster Sound?"

    Right now, exceptionally little besides the name.
    Reply
  • arkhamasylum87 - Monday, March 6, 2023 - link

    End of Raja Koduri at Intel. Another failed hire. Reply
  • mode_13h - Monday, March 6, 2023 - link

    Ah, right. He did hand off the consumer graphics and move to focus just on DCAI, right? So, Rialto would've been his main focus and therefore its cancellation represents an opportunity for a leadership change.

    I wonder how much he had to do with the complexity of Ponte Vecchio. Its use of chiplets and stacking is so far beyond anything Intel or anyone else has done or is doing that it almost looks like some kind of Frankenstein. Even MI300 doesn't quite get into the same ballpark.
    Reply
  • JayNor - Monday, March 6, 2023 - link

    An AMD engineer presenting within the last year commented that their organic substrate routing limitations prevented them from using tiled compute on the GPUs. Reply
  • mode_13h - Monday, March 6, 2023 - link

    On a related note, what do we know about the tGPU in Meteor Lake? Launching only a year after Alchemist, I don't expect it to differ much... maybe little more than a port of Arc 300 to TSMC N5?

    Or, maybe they're in a mode of using iGPU developments to feed into the dGPU pipeline, and it will be more of a precursor to Battlemage?
    Reply
  • nandnandnand - Monday, March 6, 2023 - link

    https://wccftech.com/intel-meteor-lake-cpus-to-com...
    https://www.notebookcheck.net/Leaked-Intel-Meteor-...

    The rumor mill said in December that GT2 would be 64 EUs and GT3 would be 128 EUs, down from a previous rumor that said 192 EUs. Still, it could end up twice as fast as what they have now.

    I wouldn't be surprised if all of this stuff is delayed into 2024, more than a year after Arc Alchemist.
    Reply
  • JayNor - Monday, March 6, 2023 - link

    Ian Cutress noted that the CPU-to-tGPU width on Meteor Lake is 2K. Reply
