I’ve tinkered with smaller Llama models on my Mac and Linux servers, so I thought I’d try Llama 3.1. No surprise that the 405-billion-parameter model is huge: a 200+ GB download. But even the 70B model seems like too much for my M3 with 48 GB of RAM. Going to stick with cloud models for the foreseeable future.
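
If you’re wondering why 70B doesn’t fit, here’s the back-of-envelope math. This is a rough sketch, counting only the weights; actual usage adds the KV cache, runtime overhead, and whatever the OS needs on top:

```python
# Rough memory estimate for Llama model weights at common quantization levels.
# Weights only -- KV cache and runtime overhead come on top of these numbers.

def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB (decimal) for a given size and quantization."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * bytes_per_weight  # billions of params * bytes each = GB

for params in (8, 70, 405):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{weight_memory_gb(params, bits):.0f} GB")
```

Even at 4-bit quantization, the 70B weights alone come to ~35 GB, which eats most of 48 GB of unified memory before the KV cache or anything else gets a byte. And 405B at 4-bit works out to ~200 GB, which lines up with that download size.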