The following sample code is from Go's standard library documentation:
data := make([]byte, 100)
count, err := file.Read(data)
if err != nil {
	log.Fatal(err)
}
fmt.Printf("read %d bytes: %q\n", count, data[:count])
It seems to be OK. It must be correct because it's from the official documentation of the standard library, right?
Let's spend a few seconds to figure out what's wrong with it before reading the documentation of io.Reader, which declares the Read function.
The if statement in the sample should have been written like this (at least):
if err != nil && err != io.EOF {
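To make the difference concrete, here is a minimal sketch that compares the documented check with the corrected one. The helper names newSource, readNaive and readTolerant are hypothetical and exist only for this illustration. iotest.DataErrReader from the standard testing/iotest package wraps a reader so that the final data arrives together with io.EOF, which the io.Reader contract permits.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// newSource is a hypothetical helper for this sketch. With eofWithData set,
// the returned reader delivers its last bytes together with io.EOF.
func newSource(s string, eofWithData bool) io.Reader {
	var r io.Reader = strings.NewReader(s)
	if eofWithData {
		r = iotest.DataErrReader(r)
	}
	return r
}

// readNaive mirrors the documentation sample: any non-nil error aborts.
func readNaive(r io.Reader) (string, error) {
	data := make([]byte, 100)
	count, err := r.Read(data)
	if err != nil {
		return "", err
	}
	return string(data[:count]), nil
}

// readTolerant applies the corrected check: io.EOF is not treated as a failure.
func readTolerant(r io.Reader) (string, error) {
	data := make([]byte, 100)
	count, err := r.Read(data)
	if err != nil && err != io.EOF {
		return "", err
	}
	return string(data[:count]), nil
}

func main() {
	for _, eofWithData := range []bool{false, true} {
		naive, nerr := readNaive(newSource("hello", eofWithData))
		tolerant, terr := readTolerant(newSource("hello", eofWithData))
		fmt.Printf("EOF with data: %v, naive: %q (%v), tolerant: %q (%v)\n",
			eofWithData, naive, nerr, tolerant, terr)
	}
}
```

With the plain strings.Reader both variants return the data. Once the same bytes come through iotest.DataErrReader, the naive variant reports an error and loses the data, while the tolerant one still returns it.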
Did I trick you (and myself)? Why didn’t we check the documentation of the File.Read function? Isn’t it the correct one? Well, it shouldn’t be the only one.
What good are interfaces if we cannot hide the implementation details behind them? The interface should set its semantics, not the implementer, as File.Read did. What happens to the code above when the interface implementer is something other than File, but is still an io.Reader? It exits too early when the reader returns data and io.EOF together, which is allowed for all io.Reader implementers.
In Go, you don’t need to mark an implementer of an interface explicitly. It’s a powerful feature. But does it mean that we should always use interface semantics according to the static type? For example, should the following Copy function use io.Reader semantics?
func Copy(dst Writer, src Reader) (written int64, err error) {
	src.Read() // now read semantics come from io.Reader?
	...
}
But should this version use only os.File semantics? (Note, these are just dummy examples.)
func Copy(dst os.File, src os.File) (written int64, err error) {
	src.Read() // and now read semantics come from os.File's Read function?
	...
}
Practice has taught us that it’s always better to rely on interface semantics than to bind yourself to the implementation: the famous loose coupling.
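As a rough illustration of relying on interface semantics only, the following sketch shows a Copy-like loop written against the documented io.Reader contract. The names copyAll and copyString are hypothetical, and this is not how io.Copy itself is implemented. The loop consumes the bytes of each call before looking at the error, treats io.EOF as a graceful end, and tolerates a zero-byte read with a nil error.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// copyAll is a hypothetical Copy-like helper that relies only on the
// documented io.Reader contract, never on a concrete implementation.
func copyAll(dst io.Writer, src io.Reader) (written int64, err error) {
	buf := make([]byte, 4)
	for {
		n, rerr := src.Read(buf)
		// Consume the n bytes first: data may arrive together with an error.
		if n > 0 {
			w, werr := dst.Write(buf[:n])
			written += int64(w)
			if werr != nil {
				return written, werr
			}
		}
		if rerr == io.EOF {
			return written, nil // graceful end of input
		}
		if rerr != nil {
			return written, rerr
		}
		// n == 0 with a nil error is allowed; simply read again.
	}
}

// copyString is a hypothetical convenience wrapper that runs copyAll with
// either of the two permitted end-of-file behaviours.
func copyString(s string, eofWithData bool) (string, error) {
	var src io.Reader = strings.NewReader(s)
	if eofWithData {
		src = iotest.DataErrReader(src) // final data arrives together with io.EOF
	}
	var dst bytes.Buffer
	_, err := copyAll(&dst, src)
	return dst.String(), err
}

func main() {
	for _, eofWithData := range []bool{false, true} {
		out, err := copyString("hello, reader", eofWithData)
		fmt.Printf("EOF with data: %v, copied %q, err: %v\n", eofWithData, out, err)
	}
}
```

Because the loop never assumes when io.EOF shows up, the same code copies everything from both kinds of reader.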
The interface has the following problems:
You cannot use the Read function without studying the documentation of io.Reader.
You cannot implement the Read function without closely studying the documentation of io.Reader.
The previous problems multiply because io.Reader is an interface. That brings a cross-package dependency between every implementer of io.Reader and every caller of the Read function.
There are many other examples in the standard library itself where callers of the io.Reader interface misuse it.
According to this issue, the standard library, and especially its tests, are tied to the if err != nil idiom, which prevents optimizations in Read implementations. For instance, you cannot return io.EOF immediately when it’s detected (i.e. together with the remaining data) without breaking some of the callers. The reason is apparent: the reader interface documentation allows two different types of implementations.
When Read encounters an error or end-of-file condition after successfully reading n > 0 bytes, it returns the number of bytes read. It may return the (non-nil) error from the same call or return the error (and n == 0) from a subsequent call.
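The two permitted behaviours can be observed with a small trace. This is only a sketch: traceReads and traceString are hypothetical helpers, strings.Reader stands in for an implementation that reports io.EOF on a separate call, and iotest.DataErrReader stands in for one that reports it together with the last data.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// traceReads is a hypothetical helper: it calls Read until a non-nil error
// appears and records the (n, err) pair of every call.
func traceReads(r io.Reader) []string {
	var calls []string
	buf := make([]byte, 16)
	for {
		n, err := r.Read(buf)
		calls = append(calls, fmt.Sprintf("n=%d err=%v", n, err))
		if err != nil {
			return calls
		}
	}
}

// traceString traces the reads of s, optionally through iotest.DataErrReader.
func traceString(s string, eofWithData bool) []string {
	var r io.Reader = strings.NewReader(s)
	if eofWithData {
		r = iotest.DataErrReader(r)
	}
	return traceReads(r)
}

func main() {
	fmt.Println(traceString("abc", false)) // EOF arrives on a separate call
	fmt.Println(traceString("abc", true))  // EOF arrives with the data
}
```

The first trace needs two calls to reach io.EOF, the second only one. A caller has to be prepared for both.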
Interfaces should be intuitive and formally defined by the programming language itself, so that you cannot implement them incorrectly or misuse them. You should not need to read the documentation to be able to do the necessary error propagation.
It’s problematic that multiple (two in this case) explicitly different behaviours of an interface function are allowed. The whole idea of interfaces is that they hide the implementation details and enable loose coupling.
The most obvious problem is that the io.Reader interface is neither intuitive nor idiomatic with respect to Go’s typical error handling. It also breaks the reasoning about the separated control paths: normal and error. The interface uses the error transport mechanism for something which isn’t an actual error.
EOF is the error returned by Read when no more input is available. Functions should return EOF only to signal a graceful end of input. If the EOF occurs unexpectedly in a structured data stream, the appropriate error is either ErrUnexpectedEOF or some other error giving more detail.
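The difference between a graceful and an unexpected end of input can be sketched with io.ReadFull from the standard library. It returns io.EOF only if no bytes were read, and io.ErrUnexpectedEOF if the input ends after some but not all of the requested bytes. The readRecord helper and its result strings are hypothetical, chosen only for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readRecord is a hypothetical helper that tries to read one fixed-size
// record of size bytes from input and names the outcome.
func readRecord(input string, size int) string {
	buf := make([]byte, size)
	_, err := io.ReadFull(strings.NewReader(input), buf)
	switch err {
	case nil:
		return "complete record"
	case io.EOF:
		return "graceful end of input" // nothing was read at all
	case io.ErrUnexpectedEOF:
		return "truncated record" // input ended in the middle of a record
	default:
		return "other error: " + err.Error()
	}
}

func main() {
	for _, input := range []string{"abcd", "ab", ""} {
		fmt.Printf("%q: %s\n", input, readRecord(input, 4))
	}
}
```

Even here the graceful case travels through the error return value, so the caller has to compare error values to find out that nothing went wrong.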
The io.Reader interface and io.EOF show what is missing from Go’s current error handling: the error distinction. For example, Swift and Rust don’t allow partial failure. A function call either succeeds or it fails. That’s one of the problems with Go’s error return values: the compiler cannot offer any support for the distinction. It’s the same well-known problem as with C’s non-standard error returns, where the error return channel overlaps with the normal one.
Herb Sutter put it conveniently in his C++ proposal, Zero-overhead deterministic exceptions: Throwing values:
“Normal” vs. “error” [control flow] is a fundamental semantic distinction, and probably the most important distinction in any programming language even though this is commonly underappreciated.
Go’s current io.Reader interface is problematic because it violates this semantic distinction.
Adding The Semantic Distinction
First, we stop using the error return for something which isn’t an error by declaring a new interface function:
Read(b []byte) (n int, left bool, err error)
Allowing Only Obvious Behaviour
Second, to avoid confusion and prevent obvious mistakes, our guidance is to use the following helper wrapper, which handles both of the allowed EOF behaviours. The wrapper offers only one explicit way to process the end of the data. Because the documentation says that returning zero bytes without any error (including EOF) must be allowed (implementations are only “discouraged from returning a zero byte count with a nil error”), we cannot use zero bytes read as the mark of EOF. Of course, the wrapper also maintains the error distinction.
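// MyReader wraps an io.Reader and reports the end of data through the
// left flag instead of through io.EOF.
type MyReader struct {
	r   io.Reader
	eof bool // set once the wrapped reader has returned io.EOF
}
// Read returns left == false only after all the data has been delivered.
func (mr *MyReader) Read(b []byte) (n int, left bool, err error) {
	if mr.eof {
		// EOF was seen on an earlier call: no data and nothing left.
		return 0, false, nil
	}
	n, err = mr.r.Read(b)
	mr.eof = err == io.EOF
	left = true // this call may still carry data in b[:n], even at EOF
	if mr.eof {
		err = nil // EOF is not an error; the next call reports left == false
	}
	return
}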
We made an error distinction rule where error and success results are exclusive. We have used the same distinction for the left return value as well. It is set to false once all the data has been read, which makes the function easier to use, as can be seen in the following for loop. You need to handle incoming data only when left is set, i.e. data is available.
for n, left, err := src.Read(dst); err == nil && left; n, left, err = src.Read(dst) {
	fmt.Printf("read: %d, data left: %v, err: %v\n", n, left, err)
}
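Put together as a runnable program, the wrapper and the loop might look like the sketch below. MyReader restates the wrapper from this article. The readAll and readString helpers, the 4-byte buffer and the use of iotest.DataErrReader to simulate a reader that returns io.EOF together with data are assumptions made for the illustration.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"testing/iotest"
)

// MyReader restates the article's wrapper: the end of data is reported
// through left, never through io.EOF.
type MyReader struct {
	r   io.Reader
	eof bool
}

func (mr *MyReader) Read(b []byte) (n int, left bool, err error) {
	if mr.eof {
		return 0, false, nil // everything was delivered on earlier calls
	}
	n, err = mr.r.Read(b)
	mr.eof = err == io.EOF
	if mr.eof {
		err = nil // EOF is not an error
	}
	return n, true, err
}

// readAll is a hypothetical helper that drains src with the loop shape
// shown in the article and collects the data.
func readAll(src *MyReader) (string, error) {
	var sb strings.Builder
	buf := make([]byte, 4)
	n, left, err := src.Read(buf)
	for ; err == nil && left; n, left, err = src.Read(buf) {
		sb.Write(buf[:n])
	}
	return sb.String(), err
}

// readString is a hypothetical helper that runs readAll over s with either
// of the two end-of-file behaviours io.Reader permits.
func readString(s string, eofWithData bool) (string, error) {
	var r io.Reader = strings.NewReader(s)
	if eofWithData {
		r = iotest.DataErrReader(r)
	}
	return readAll(&MyReader{r: r})
}

func main() {
	for _, eofWithData := range []bool{false, true} {
		out, err := readString("hello, reader", eofWithData)
		fmt.Printf("EOF with data: %v, read %q, err: %v\n", eofWithData, out, err)
	}
}
```

The loop body never inspects io.EOF, and the result is the same regardless of which of the two behaviours the wrapped reader follows.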
As the sample code shows, it allows the happy path and the error control flow to be separated, which makes reasoning about the program much easier. The solution we showed here isn’t perfect because Go’s multiple return values aren’t mutually exclusive. In our case, they all should be. However, we have learned that every newcomer (including those who are new to Go) can use our new Read function without documentation or sample code. That is an excellent example of how important the semantic distinction between happy and error paths is.
Can we say that io.EOF is a mistake? I’d say so. There is a very good reason why errors should be distinct from expected returns. We should always build algorithms that favour the happy path and prevent errors.
Go’s error handling practice is still missing language features to help with the semantic distinction. Luckily, most of us already treat errors in a distinct control flow.