Resources: Context Limits and Performance of Edge Weights

Quantifying how memory consumption and throughput scale as the historical context grows.

Context Buffer Sizing

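While modern open-weight models support very long contexts (e.g. 32K or 128K tokens), filling that context locally is expensive: the KV cache grows linearly with context length, and the attention compute needed to process the prompt grows quadratically. A standard MacBook might run an 8B model at 30 tokens per second (TPS) on a fresh context, but drop below 5 TPS once the buffer reaches 32,000 tokens.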
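
To make the sizing concrete, here is a rough sketch that estimates the KV-cache footprint for a given number of buffered tokens. The default parameters (32 layers, 8 key/value heads, head dimension 128, 16-bit cache entries) are assumptions chosen to resemble a typical 8B model with grouped-query attention; they are not taken from any specific runtime, and the function name kv_cache_bytes is hypothetical. Real memory use will also include the weights themselves and runtime overhead.

```python
def kv_cache_bytes(n_tokens, n_layers=32, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Estimate KV-cache size in bytes for n_tokens of buffered context.

    All defaults are illustrative assumptions for an 8B-class model,
    not measured values from a particular inference engine.
    """
    # Each token stores one key and one value vector (hence the factor 2)
    # per layer and per key/value head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return n_tokens * per_token


if __name__ == "__main__":
    # Print the estimated cache size at a few context lengths.
    for n in (1_024, 8_192, 32_768, 131_072):
        gib = kv_cache_bytes(n) / 2**30
        print(f"{n:>7} tokens -> {gib:.2f} GiB")
```

Under these assumed parameters each token costs 128 KiB of cache, so a full 32K-token buffer needs about 4 GiB on top of the model weights. A quantized cache or a model with fewer key/value heads would shrink that figure proportionally.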