M4 Rumors (requests for).

leman · Feb 27, 2024

Jimmyjames said:
May I ask which site you use to search for patents? Do you just search for ‘Apple’ or are there specific engineers you search for?

WIPO - Search International and National Patent Collections

patentscope.wipo.int

They are neat since they update every week. I would just search for "Apple" once per month while having my morning cup of tea. One doesn't need more than 5-10 minutes to browse through 1000 patents and determine the interesting ones just from the title.

leman · Feb 27, 2024

P.S. I just had a quick search for patents filed by Apple using keyword "Neural" since Fall 2013, and here are the patents I was talking about:

WIPO - Search International and National Patent Collections

And also some other interesting things:

WIPO - Search International and National Patent Collections

Jimmyjames · Feb 27, 2024

leman said:
WIPO - Search International and National Patent Collections

patentscope.wipo.int

They are neat since they update every week. I would just search for "Apple" once per month while having my morning cup of tea. One doesn't need more than 5-10 minutes to browse through 1000 patents and determine the interesting ones just from the title.

Many thanks.

casperes1996 · Feb 27, 2024

leman said:
Also a bunch of patents for hardware based merge sorts.

That’s interesting. Sorting is quite a common algorithm. If there’s something to be won from hardware specific to it not also lost in spinning up that hardware I’m curious why it hasn’t been done before.

leman · Feb 27, 2024

casperes1996 said:
That’s interesting. Sorting is quite a common algorithm. If there’s something to be won from hardware specific to it not also lost in spinning up that hardware I’m curious why it hasn’t been done before.

Oh, there are plenty of hardware sorting solutions around, it's mostly specialized stuff though. I suppose there are quite a lot of details around sorting, and current processors can already pretty much saturate the memory bandwidth doing sorting via the general-purpose ISA. But I suppose if you need sorting in a neural processor, which is less programmable than a CPU, a dedicated sorting unit might be useful. If it allows you to forgo the cost of doing a roundtrip to the CPU, that already might be a win. Although I am curious why they see a need for sorting in a neural processor. Are there many ML algorithms and models that rely on sorting?

casperes1996 · Feb 27, 2024

leman said:
Oh, there are plenty of hardware sorting solutions around, it's mostly specialized stuff though. I suppose there are quite a lot of details around sorting, and current processors can already pretty much saturate the memory bandwidth doing sorting via the general-purpose ISA. But I suppose if you need sorting in a neural processor, which is less programmable than a CPU, a dedicated sorting unit might be useful. If it allows you to forgo the cost of doing a roundtrip to the CPU, that already might be a win. Although I am curious why they see a need for sorting in a neural processor. Are there many ML algorithms and models that rely on sorting?

Huh. Never seen dedicated sorting hardware that I know of. But cool. ML really isn’t my area so can’t speak to that.

Yoused · Feb 27, 2024

Here are the "infringed" patents in question (pdfs):

First one appears to be about bonding a chip to a sheet of metal, then etching it out to form leads

https://ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7609527

These ones (just replace the end number in the url to get the file) cover methods of pushing the chip package into warm soft plastic as it is starting to harden
7732909
7989944
8368201

Something about bonding the leads inside the adhesive that holds the chip on the board
8222723

Making the leads that pass through the board much skinnier than older methods.
8238113

Using multi-layer metal sandwich leads where the inner layer is pushed up into the package
9107324

A method of embedding the board in the device chassis
11071207

Embedding chip packages inside a board contacting exterior leads, I think
11716816
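
The "replace the end number in the url" step can be sketched in a few lines of Python. This is only an illustration: the base URL is copied from the link above, the numbers are the ones listed in this post, and the helper name `patent_pdf_url` is made up. It assumes every number resolves under the same path, which is not verified here.

```python
# Base URL copied from the link in the post above.
BASE_URL = "https://ppubs.uspto.gov/dirsearch-public/print/downloadPdf/"

# Patent numbers exactly as listed in the post.
PATENT_NUMBERS = [
    "7609527", "7732909", "7989944", "8368201", "8222723",
    "8238113", "9107324", "11071207", "11716816",
]


def patent_pdf_url(number: str) -> str:
    """Build a download URL by appending the patent number (hypothetical helper)."""
    return BASE_URL + number


if __name__ == "__main__":
    # Print one URL per patent; nothing is downloaded.
    for number in PATENT_NUMBERS:
        print(patent_pdf_url(number))
```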

dada_dave · Feb 27, 2024

leman said:
Oh, there are plenty of hardware sorting solutions around, it's mostly specialized stuff though. I suppose there are quite a lot of details around sorting, and current processors can already pretty much saturate the memory bandwidth doing sorting via the general-purpose ISA. But I suppose if you need sorting in a neural processor, which is less programmable than a CPU, a dedicated sorting unit might be useful. If it allows you to forgo the cost of doing a roundtrip to the CPU, that already might be a win. Although I am curious why they see a need for sorting in a neural processor. Are there many ML algorithms and models that rely on sorting?

The last time I programmed a neural net was in high school and two layers was considered advanced. When I saw your post, I tried looking it up but only found a bunch of links where the goal was to train neural networks to write sorting algorithms. Which is not what we’re looking for. So not sure.

leman · Feb 27, 2024

dada_dave said:
The last time I programmed a neural net was in high school and two layers was considered advanced. When I saw your post, I tried looking it up but only found a bunch of links where the goal was to train neural networks to write sorting algorithms. Which is not what we’re looking for. So not sure.

Just had an idea. A token-predicting machine (like ChatGPT) generates a list of occurrence probabilities. You need to sort it to sample from it efficiently. If you can do this from within the neural processor you can save a bit of latency and a bunch of data transfers.
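
To make the idea concrete, here is a minimal sketch of one sampling scheme that relies on a sort: top-p (nucleus) sampling. The post does not say which scheme is meant, so the choice of top-p, the function name `top_p_sample` and the parameter `p` are assumptions for illustration only.

```python
import random


def top_p_sample(probs, p=0.9, rng=random):
    """Sample an index from the most likely tokens whose total probability reaches p."""
    # Sort token indices by probability, most likely first. This is the sort step.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Keep the shortest prefix of that ordering whose cumulative probability reaches p.
    kept = []
    total = 0.0
    for i in order:
        kept.append(i)
        total += probs[i]
        if total >= p:
            break

    # Draw from the kept tokens, weighted by their original probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

With a vocabulary of tens of thousands of tokens, the sort runs once per generated token, which is where doing it on the neural processor itself could plausibly save a round trip.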

dada_dave · Feb 27, 2024

leman said:
Just had an idea. A token-predicting machine (like ChatGPT) generates a list of occurrence probabilities. You need to sort it to sample from it efficiently. If you can do this from within the neural processor you can save a bit of latency and a bunch of data transfers.

Very plausible and fitting with the rumored goal of it focusing on generative AI. Although I wonder at that point why stop at the sort and not just accelerate the entire Alias method in hardware. Maybe they will!
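
For reference, a minimal software sketch of the alias method (Vose's variant). The function names are made up for illustration, and nothing here reflects how a hardware version would be built. Setup is O(n) and needs no sort; each draw afterwards is O(1).

```python
import random


def build_alias_table(probs):
    """Build the alias table for a discrete distribution (probs should sum to 1)."""
    n = len(probs)
    scaled = [p * n for p in probs]
    prob = [0.0] * n   # chance of keeping slot i when it is picked
    alias = [0] * n    # fallback index used when slot i is not kept
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]

    # Pair each under-full slot with an over-full one that donates the missing mass.
    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)

    # Whatever is left is exactly full, up to rounding error.
    for i in large + small:
        prob[i] = 1.0
        alias[i] = i
    return prob, alias


def alias_sample(prob, alias, rng=random):
    """Draw one index: pick a slot uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]
```

One caveat: the table pays off when many samples are drawn from the same distribution. A token predictor produces a new distribution at every step, so the O(n) setup would be repeated each time.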

Cmaier · Feb 27, 2024

Yoused said:
Here are the "infringed" patents in question (pdfs):

First one appears to be about bonding a chip to a sheet of metal, then etching it out to form leads

https://ppubs.uspto.gov/dirsearch-public/print/downloadPdf/7609527

These ones (just replace the end number in the url to get the file) cover methods of pushing the chip package into warm soft plastic as it is starting to harden
7732909
7989944
8368201

Something about bonding the leads inside the adhesive that holds the chip on the board
8222723

Making the leads that pass through the board much skinnier than older methods.
8238113

Using multi-layer metal sandwich leads where the inner layer is pushed up into the package
9107324

A method of embedding the board in the device chassis
11071207

Embedding chip packages inside a board contacting exterior leads, I think
11716816

No those are not? I mean, some of them are. Maybe that’s the list from the prior Samsung lawsuit. Also don’t think those descriptions are right. Also, “infringed” shouldn’t be in quotes.

Cmaier · Feb 27, 2024

casperes1996 said:
Huh. Never seen dedicated sorting hardware that I know of. But cool. ML really isn’t my area so can’t speak to that.

I just designed one in my head. It would have a ton of adders, a large dedicated set of registers, with pointers into a set of ordinals. Don’t know that it would actually speed anything up, though
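
One possible reading of that description, and it is only a guess, is a rank (enumeration) sort: compare every element against every other, add up the comparison results to get each element's ordinal, then write each element to that position. A software sketch under that assumption:

```python
def rank_sort(values):
    """Sort by computing each element's ordinal from pairwise comparisons."""
    n = len(values)
    ranks = [0] * n
    for i in range(n):
        # In hardware, the n*n comparisons could all run in parallel,
        # with an adder per element summing the one-bit results.
        for j in range(n):
            smaller = values[j] < values[i]
            tie_before = values[j] == values[i] and j < i  # keeps ranks unique
            if smaller or tie_before:
                ranks[i] += 1

    # Each rank is the element's final position in the sorted output.
    out = [None] * n
    for i, r in enumerate(ranks):
        out[r] = values[i]
    return out
```

The comparator count grows with the square of the number of elements, so this shape only makes sense for small, fixed-size batches.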

casperes1996 · Feb 27, 2024

Cmaier said:
I just designed one in my head. It would have a ton of adders, a large dedicated set of registers, with pointers into a set of ordinals. Don’t know that it would actually speed anything up, though

But that's what I was thinking; If there's really a good reason to do so

Yoused · Feb 27, 2024

Cmaier said:
Don’t know that it would actually speed anything up, though

If it is likely to get significant use, it would definitely speed things up at least a bit merely by dint of other stuff being able to do other stuff at the same time.

Cmaier · Feb 27, 2024

Yoused said:
If it is likely to get significant use, it would definitely speed things up at least a bit merely by dint of other stuff being able to do other stuff at the same time.

I’m just wondering if it’s possible to make it big enough to pay for the overhead of shifting things into and out of the unit, in actual use. I have some experience designing big parallel comparators, and they aren’t all that small. You’d also have to reduce the sort function to, say, an unsigned long int comparison by some sort of hashing, which presumably you’d do one time prior to loading the unit, but that should be O(n).
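
A small sketch of that key-reduction step, assuming records with two unsigned fields. The field names `priority` and `timestamp` are hypothetical. Note that the mapping has to preserve order, so this packs the fields into one integer instead of using a general-purpose hash.

```python
def pack_key(priority: int, timestamp: int) -> int:
    """Pack two unsigned 32-bit fields into one 64-bit key.

    Comparing packed keys gives the same order as comparing (priority, timestamp).
    """
    assert 0 <= priority < 2**32 and 0 <= timestamp < 2**32
    return (priority << 32) | timestamp


def sort_records(records):
    """Sort records by (priority, timestamp) using integer comparisons only."""
    # One O(n) pass to compute keys, done once before "loading the unit".
    keyed = [(pack_key(r["priority"], r["timestamp"]), idx)
             for idx, r in enumerate(records)]
    # Stand-in for the hardware sorter: it only ever sees plain integers.
    keyed.sort(key=lambda pair: pair[0])
    # Follow the indices back to the original records.
    return [records[idx] for _, idx in keyed]
```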

casperes1996 · Feb 27, 2024

Cmaier said:
I’m just wondering if it’s possible to make it big enough to pay for the overhead of shifting things into and out of the unit, in actual use. I have some experience designing big parallel comparators, and they aren’t all that small. You’d also have to reduce the sort function to, say, an unsigned long int comparison by some sort of hashing, which presumably you’d do one time prior to loading the unit, but that should be O(n).

I assume that at some array sizes it would be worth it. But yeah I imagine the arrays would have to be quite big and regular to warrant this sort of thing instead of another E core or something

Cmaier · Feb 27, 2024

casperes1996 said:
I assume that at some array sizes it would be worth it. But yeah I imagine the arrays would have to be quite big and regular to warrant this sort of thing instead of another E core or something

By the way, funny that I typed O ( n ) and it got converted to a thumbs down

casperes1996 · Feb 27, 2024

Cmaier said:
By the way, funny that I typed O ( n ) and it got converted to a thumbs down

Hehe I did imagine that was supposed to be O(n). Whole sorting still bounded by O(n log n) at best though so the potentially additional O(n) isn’t too important for large enough arrays. But if n needs to be that large the point also goes away a bit for consumer use cases. And for large enough n we become I/O bound as well. I’d love to see where any hardware accelerated sorting is currently being used

Cmaier · Feb 27, 2024

casperes1996 said:
Hehe I did imagine that was supposed to be O(n). Whole sorting still bounded by O(n log n) at best though so the potentially additional O(n) isn’t too important for large enough arrays. But if n needs to be that large the point also goes away a bit for consumer use cases. And for large enough n we become I/O bound as well. I’d love to see where any hardware accelerated sorting is currently being used

Yeah, me too. Seems to me that it might make sense for certain special-purpose machines (or sub portions of special purpose algorithms - sort vertices by x-coordinate?). But so much sorting involves an array of pointers to objects where you are sorting using complex criteria on properties of properties stored in non-sequential memory addresses - there would often be a lot of work to set things up before you could even let the algorithm do its thing. And then you’d never be able to fit the entire data set into the thing at once (because if your data set is small you are probably ok using a general purpose processor), so now you are dividing things up and merging the results, which still ends up being a lot of swapping.
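
The divide-and-merge pattern described here looks roughly like the sketch below. The capacity value and the function names are made up, and `sorted` stands in for the fixed-size hardware sorter.

```python
import heapq

UNIT_CAPACITY = 4  # hypothetical number of elements the sorter can hold at once


def unit_sort(chunk, capacity=UNIT_CAPACITY):
    """Stand-in for the hardware unit: sorts at most `capacity` elements."""
    assert len(chunk) <= capacity
    return sorted(chunk)


def sort_with_small_unit(data, capacity=UNIT_CAPACITY):
    """Sort a data set larger than the unit by sorting chunks and merging the runs."""
    # Every chunk has to be moved into the unit and back out again.
    runs = [unit_sort(data[i:i + capacity], capacity)
            for i in range(0, len(data), capacity)]
    # The k-way merge of the sorted runs still happens outside the unit.
    return list(heapq.merge(*runs))
```

The merge touches every element again, which is the "lot of swapping" that remains even when the chunks themselves are sorted in hardware.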

Yoused · Feb 27, 2024

Cmaier said:
No those are not?

I got the numbers from worldipreview:

The patents at issue in the action against Apple are US patents 7,609,527; 7,732,909; 7,989,944; 8,222,723, 8,238,113; 8,368,201; 9,107,324; 11,071,207; and 11,716,816.

The complaint was filed for ImberaTek ...
