manton

Catching up on more of Apple’s new AI architecture. Finally have some clarity that the sort-of-default Apple Foundation Models will run on Apple servers, while the most capable “Pro” model will run on Nvidia chips in Google Cloud. Seems like a reasonable way to split things up.

mdu4.bsky.social

@manton Any insight on the context window size? Is it still 4,096 tokens?

In reply to
manton

@mdu4.bsky.social I believe it's 4k context for on-device and 32k for Private Cloud Compute, but I forget where I read that, so I can't double-check.
