How fast is disk I/O on a Raspberry Pi 3B?
As observed via 5 different programming languages
Recently, I started an experiment in Scala that involved quite a bit of disk reads and writes. I observed that the disk I/O speeds on both the SD card and a USB3 non-SSD external drive were lower than expected. After some digging around, I found that the I/O speed on the Pi is limited to 20 MBps :(
Even so, the I/O speed in my experiment was nowhere close to 20 MBps. So, I figured I’d measure the I/O speed in Scala. As these rabbit holes go, I ended up measuring the I/O speed in C++, Go, Python, Ruby, and Scala :)
Experiment
To reduce the effect of various biases, I decided to measure the speed of reading and writing bytes using out-of-the-box APIs without tuning any configuration, e.g., buffer size. So, I used the basic buffered I/O support available in these languages to read and write 8-bit bytes.
In the case of Go and Scala, I explicitly used buffered I/O support (as implicit buffered I/O is not available); specifically, Go’s bufio package and the Buffered[Input|Output]Stream classes in Scala. In the case of C++, Python, and Ruby, I implicitly used buffered I/O support via builtin functions/libraries; specifically, C++’s standard fstream library, Python’s open builtin function with text mode and iso8859 encoding, and Ruby’s open builtin function with binary mode.
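To make the "implicit" case concrete, here is a minimal Python sketch (not one of the programs used in the experiment) showing that the builtin open returns buffered objects without being asked to. The scratch file name, the codec spelling iso-8859-1, and the newline="" argument are assumptions chosen for illustration.

```python
import io
import os
import tempfile

# Hypothetical scratch file; the path is an assumption for illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

# Binary mode: open() returns a buffered writer by default.
with open(path, "wb") as f:
    binary_is_buffered = isinstance(f, io.BufferedWriter)
    f.write(bytes(range(256)))

# Text mode: open() returns a text wrapper layered on a buffered reader.
# newline="" keeps line-ending characters untranslated, so every byte
# comes back as exactly one character.
with open(path, "r", encoding="iso-8859-1", newline="") as f:
    text_is_wrapped = isinstance(f, io.TextIOWrapper)
    text_has_buffer = isinstance(f.buffer, io.BufferedReader)
    chars = f.read()

print(binary_is_buffered, text_is_wrapped, text_has_buffer, len(chars))
```

In Go and Scala, by contrast, the buffering layer has to be wrapped around the file explicitly, as described above.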
Each program wrote 256 million bytes (values 0 through 255) one at a time into a file and then read these bytes one at a time from the file. The time for these actions was measured at nanosecond granularity (with the exception of fractional second granularity in the case of Python) using timing support available via standard libraries. These actions were repeated six times and the last five of the six measurements were averaged (to discount the warm-up period) to arrive at the final read and write speeds.
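The procedure above can be sketched roughly as follows. This is an illustration and not the actual program from the experiment: the function names are hypothetical, it uses binary mode for simplicity (the Python program described above used text mode), it derives the speed from the mean elapsed time (the post does not say exactly how the averaging was done), and the demo uses a much smaller byte count so that it finishes quickly.

```python
import os
import tempfile
import time


def write_bytes(path, count):
    """Write `count` bytes (values cycling 0..255) one at a time; return seconds."""
    start = time.perf_counter()
    with open(path, "wb") as f:  # default (implicit) buffering
        for i in range(count):
            f.write(bytes((i % 256,)))
    return time.perf_counter() - start


def read_bytes(path):
    """Read the file one byte at a time; return (bytes_read, seconds)."""
    n = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(1):
            n += 1
    return n, time.perf_counter() - start


def mean_after_warmup(samples, warmup=1):
    """Average the samples after dropping the first `warmup` of them."""
    kept = samples[warmup:]
    return sum(kept) / len(kept)


def measure(path, count, runs=6):
    """Return (write MB/s, read MB/s), discarding the first run as warm-up."""
    write_times, read_times = [], []
    for _ in range(runs):
        write_times.append(write_bytes(path, count))
        n, elapsed = read_bytes(path)
        assert n == count  # sanity check: everything written was read back
        read_times.append(elapsed)
    megabytes = count / 1e6
    return (megabytes / mean_after_warmup(write_times),
            megabytes / mean_after_warmup(read_times))


if __name__ == "__main__":
    # Small count for a quick demo; the experiment used 256 million bytes.
    demo_path = os.path.join(tempfile.mkdtemp(), "io_test.bin")
    write_speed, read_speed = measure(demo_path, 100_000)
    print(f"write: {write_speed:.2f} MB/s, read: {read_speed:.2f} MB/s")
```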
The programs were executed on a Raspberry Pi 3B to create files in
- an ext4 file system on a Class 10 SD card inserted into the builtin SD card slot and
- an exFAT file system on a USB3 non-SSD external drive connected via the USB2 port.
The programs are listed at the end of the post.
Observations
From the measurements (given in the table below), Go was the clear winner in terms of both buffered read and write speeds. It beat C++ and Scala by at least a factor of 2 in both read and write speeds.
Interestingly, Scala offered better buffered write speed than C++ while C++ offered better buffered read speed than Scala.
Since C++ is supposed to be closer to the metal, I was surprised by its slowness. So, I dug around. I found two possible explanations (that I did not explore). First, the libstdc++ implementation of fstream uses quite a few virtual functions without inlining (Stack Overflow). Second, the libstdc++ implementation of fstream uses a small buffer (Stack Overflow). So, the takeaway is that not all C++ implementations are alike and a different C++ implementation might have provided better (or worse) I/O speeds.
As for Python and Ruby, they offered very slow I/O speeds. I suppose the builtin support for I/O buffering in Python and Ruby does not suffice for the considered scenario: writing one byte at a time. Clearly, other means of buffered I/O in Python and Ruby should be considered. (Or did I miss them?) If other means are not available, then I wonder why, because I think builtin libraries should provide I/O buffering support like on the JVM.
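One candidate for such "other means" in Python is to batch bytes in user code and hand them to the file object in larger chunks, so that far fewer write and read calls are made. The sketch below is an assumption-laden illustration and was not measured in this experiment: the function names and the 64 KiB chunk size are hypothetical, and whether it actually helps on a Pi would need to be timed.

```python
import os
import tempfile


def write_chunked(path, count, chunk_size=64 * 1024):
    """Write `count` bytes (values cycling 0..255), flushing in chunks."""
    buf = bytearray()
    with open(path, "wb") as f:
        for i in range(count):
            buf.append(i % 256)          # still produces one byte at a time
            if len(buf) >= chunk_size:
                f.write(buf)             # one call per chunk, not per byte
                buf.clear()
        if buf:                          # write whatever is left over
            f.write(buf)


def read_chunked(path, chunk_size=64 * 1024):
    """Yield the file's bytes one at a time, fetching them in chunks."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            for b in chunk:              # iterating bytes yields ints 0..255
                yield b


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "chunked.bin")
    write_chunked(demo_path, 200_000)
    print(sum(1 for _ in read_chunked(demo_path)))
```

The program still handles the data byte by byte; only the calls into the I/O layer are batched.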
Independent of the languages, both read and write speeds on the USB3 drive connected via USB2 port were better than the speeds on the Class 10 SD card. I wonder what could be the reason. Could it be the quality of the SD card?
Finally, one strange observation is that, while the I/O speed of a non-overclocked Raspberry Pi has been observed to be 20 MBps, my experiment suggests the observed I/O speed could be higher; at least, with Go. Upon doing a quick-n-dirty experiment with dd if=/dev/zero of=/tmp/outfile.tmp bs=4k count=32k with different bs (block size) and block count arguments, I observed I/O speeds (rates) upward of 20 MBps at many block sizes. I wonder how this is possible.
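The dd runs above can be approximated in Python to sweep block sizes from a script. This is only a sketch: the function name write_rate, its sync parameter, and the sizes in the demo are hypothetical. With sync=True the data is pushed to the device via os.fsync before the timer stops; comparing the two settings is one way to check how much of a reported rate comes from data that has not yet reached the device, though this post did not verify whether that explains the numbers above.

```python
import os
import tempfile
import time


def write_rate(path, block_size, block_count, sync=False):
    """Write zero-filled blocks, roughly like dd if=/dev/zero; return MB/s."""
    block = bytes(block_size)  # a block of zero bytes
    start = time.perf_counter()
    # buffering=0 disables Python's own buffer: one write call per block.
    with open(path, "wb", buffering=0) as f:
        for _ in range(block_count):
            f.write(block)
        if sync:
            os.fsync(f.fileno())  # wait until the OS has flushed the file
    elapsed = time.perf_counter() - start
    return block_size * block_count / 1e6 / elapsed


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "outfile.tmp")
    total = 8 * 1024 * 1024  # 8 MiB per run; small on purpose for a demo
    for bs in (512, 4096, 65536):
        for sync in (False, True):
            rate = write_rate(demo_path, bs, total // bs, sync=sync)
            print(f"bs={bs:>6} sync={sync!s:<5} {rate:8.2f} MB/s")
```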
[See Updates below]
Closing Thoughts
While the experiment clearly shows which language offers the best I/O speed on a Raspberry Pi, it also suggests that all things may not be equal (e.g., implementation details of C++, builtin support for buffered I/O in Python and Ruby) and that we should carefully consider how technologies support the aspects that are crucial for the task at hand.
Next up, what would I find if I ran the same programs on a desktop/laptop?
Artifacts
The code and collected data are available on GitHub.
Updates
02/13/2019: Based on a suggestion by stefan pantos, upon using istream::get instead of istream::read and ostream::put instead of ostream::write in C++,
- the read performance improved by 40% (14 MB/s on the SD card and 14 MB/s on USB3), and
- the write performance improved by at least 100% (10 MB/s on the SD card and 11 MB/s on USB3).
The improvement is significant enough to prefer C++ over Scala on a Raspberry Pi.