UPDATE: I don’t mean to say that this is a bad choice, or that it’s a bug, or even that it has performance implications. It’s just a choice that seemed a bit opaque without all the history spelunking I did here, and it’s interesting to see the reasoning behind it.
There’s a 1GiB limit for a single Read call on an os.File struct in Go, even though the native read syscall can fill a 2GiB buffer (as tested on my ARM macOS and Intel Linux machines). I ran into this when looking at a pprof profile of a sample word count program I was writing, which showed the program was spending way too much time in the syscall package. In this context that can only mean one thing: way too many read syscalls were being made. Something like this would show this behaviour:
That, on a 2.5G file, would output something like:
Even though the initialised buffer size is 2GiB, only 1GiB is read into the buffer per iteration. Upon digging into the source code, it looks like this is a deliberate choice. The main change logs from the history point to the following:
- https://codereview.appspot.com/89900044 as a fix for golang/go#7812. This fixed failing reads on file sizes greater than or equal to 2GiB on macOS and FreeBSD by capping each read syscall to 2GiB-1 bytes. For the rest of the operating systems, at this point, there was no cap.
- https://codereview.appspot.com/94070044 as a followup to the first change, where the limit was decreased to 1GiB without any OS checks, with an explanation that it would at least allow for aligned reads from disk, as opposed to an odd number that might miss page caches (my understanding).
Note that a lot has changed since that changeset, and the current file reference for the _unix.go file in the changeset is src/internal/poll/fd_unix.go.
Aside: System limits
As per the Linux read syscall documentation, the maximum number of bytes that can be transferred in one call is 0x7ffff000 (2,147,479,552), which is just under 2GiB. I tested this out with rudimentary programs in Rust and C. The Rust program is taken verbatim from the example for read_to_end(). Running that under strace has the following output (truncated here):
And a similar, simple C program results in similar output when using the read syscall in a loop until the file is read:
Although that’s neither here nor there, it’s still interesting that Go first picked 2GiB-1 and then settled on 1GiB, using the odd buffer size of the former as the justification.