Sunday, March 25, 2012

Pure delay lines

While working on my dsp-extras project, I needed a pure delay line.  Based on previous work, I used Data.Sequence.Seq, and was somewhat surprised to see a very large performance hit when compared to an undelayed signal.  Obviously there was more to creating pure buffers in Haskell than my previous microbenchmarks revealed.  I resolved to get the performance back.

I created a test script to measure the performance of different delay-line implementations in a real-world application.  The application simply reads in a sound file, applies a fractional delay, and writes the output.  I measured the performance with buffer sizes ranging from 1 to 10000 samples.  Due to interpolation (I used cubic interpolation for all my tests), I couldn't actually measure a 1-sample buffer; the program creates a 4-sample buffer for the smallest possible delay.  All other buffer sizes are accurate, though.
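
For reference, a cubic interpolator needs the two samples on either side of the read point, which is where that 4-sample minimum comes from.  A minimal sketch of one common kernel (a Catmull-Rom style cubic; not necessarily the exact interpolation dsp-extras uses):

    -- y0..y3 are the four samples surrounding the read point; x is the
    -- fractional offset in [0,1) between y1 and y2.
    cubicInterp :: Double -> Double -> Double -> Double -> Double -> Double
    cubicInterp y0 y1 y2 y3 x = ((c3 * x + c2) * x + c1) * x + c0
      where
        c0 = y1
        c1 = 0.5 * (y2 - y0)
        c2 = y0 - 2.5 * y1 + 2 * y2 - 0.5 * y3
        c3 = 0.5 * (y3 - y0) + 1.5 * (y1 - y2)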

I tested the following buffer types:
  • Data.Sequence.Seq - cons elements onto the front and drop from the tail as each element is pushed
  • Data.Vector.Unboxed.Vector - a naive copying implementation
  • T10 - a 10-element strict data type (e.g. data T10 a = T10 !a !a ...); a shrunk-down sketch of this style appears after this list
  • T25 - a 25-element strict data type
  • Evil - based upon Data.Vector.Unboxed.Vector, the Evil buffer thaws the vector, mutates a single element, and returns the updated vector.  This buffer is closest to an imperative circular buffer implementation, although it violates referential transparency.  Hence the Evil moniker.
  • ComboBuffer - the ComboBuffer consists of two unboxed vectors and an update list.  After the update list grows large enough, it's frozen into a vector and the oldest data is dropped.  Writing elements is performed in amortized O(1) time.  The read behavior is more complex.
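
To make the Tn buffers concrete, here is a minimal sketch in the same style, shrunk to four slots (T4 and its functions are illustrative names, not code from dsp-extras):

    -- A fixed-size, fully strict buffer in the style of T10/T25.
    data T4 a = T4 !a !a !a !a

    -- Push the newest element onto the front and drop the oldest.
    pushT4 :: a -> T4 a -> T4 a
    pushT4 x (T4 a b c _) = T4 x a b c

    -- Reads are a small case analysis on the index.
    indexT4 :: T4 a -> Int -> a
    indexT4 (T4 a _ _ _) 0 = a
    indexT4 (T4 _ b _ _) 1 = b
    indexT4 (T4 _ _ c _) 2 = c
    indexT4 (T4 _ _ _ d) 3 = d
    indexT4 _ _ = error "indexT4: index out of range"
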
The results are summarized online here.  A few notes about interpreting the results:
  • Given times are the result of a single run, although they are fairly stable over multiple executions.
  • The chart shows times in a log scale to help differentiate times for small buffers.  Otherwise the O(n) behavior of Vector obliterates these differences.
  • Current is the behavior of an algorithm that adaptively selects a buffer type based upon size.
  • All the results (except one) are for reading off the end of the buffer.  For most buffer types this shouldn't matter. It's optimal for Seq.  For the ComboBuffer, reads from the last half of the buffer can be done in O(1), but reads from the first half are more complex, as a list lookup may be required.  The Current - first quarter data is the result of accessing points 0.25 of the way into the buffer.
So for reads from the end of the buffer, my ComboBuffer is pretty good.  It needs some tuning though, especially for reads from near the front of the buffer.
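
For the curious, here is a rough sketch of the write path behind the ComboBuffer idea (types and field names are illustrative; the real implementation lives in dsp-extras and also handles reads across the vector/list boundary):

    import qualified Data.Vector.Unboxed as U

    data ComboSketch = ComboSketch
      { chunkSize :: !Int
      , older     :: !(U.Vector Double)  -- oldest chunk
      , newer     :: !(U.Vector Double)  -- middle chunk
      , updates   :: ![Double]           -- newest samples, most recent first
      , numUpd    :: !Int
      }

    -- Amortized O(1) writes: new samples are consed onto the update list;
    -- once the list reaches the chunk size it is frozen into a vector,
    -- the chunks shift down, and the oldest chunk is dropped.
    pushSample :: Double -> ComboSketch -> ComboSketch
    pushSample x buf
      | numUpd buf + 1 < chunkSize buf =
          buf { updates = x : updates buf, numUpd = numUpd buf + 1 }
      | otherwise =
          buf { older   = newer buf
              , newer   = U.fromList (reverse (x : updates buf))
              , updates = []
              , numUpd  = 0 }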

7 comments:

  1. Hi John. Great posts! I need something similar to this for my project (circular lists - for delay, filtering, and saving slices of recent history in response to certain features in the signal) - is something about this size up on hackage? I found x-dsp but it looks a little heavy.

    I wrote my own (really novice) implementation at github.com/ImAlsoGreg/scrollbuffer . But I don't want to do something redundant if the niche is filled by something better already.

    Thanks!

    1. Hi Greg,

      Yeah, x-dsp isn't really for this. Plus it's sort of an experiment that didn't entirely work out.

      All of these buffers do live in publicly-available code; however, it's part of a larger package that isn't really useful yet. The buffering portion is at http://www.tiresiaspress.us/haskell/dsp-extras/src/Data/RingBuffer/ . The ComboBuffer works pretty well, except for reads in the first quarter of the buffered output. I think it would be great for the use cases you list on your implementation. Seq has good asymptotic behavior, but the constant factors appear to be *really* big compared to other implementations.

      I can split out the buffer stuff and put it in a standalone package. I'll try to do so this weekend.

    2. Thanks for the pointers. The combo buffer code looks interesting. I'll keep my eye out for your split, and in the meantime copy some of your ideas into my code. If you find time to make a library out of the buffering part of dsp-extras, that's much appreciated! Alternatively I could copy your ideas over to my thing.

    3. I've just put combobuffer on github and hackage.

      Patches welcome.

  2. Just saw your message. This is great, thanks! I'll learn a lot reading through your library. Wondering (probably naively) how the benchmark story for vectors changes with this new 'generalized stream fusion' stuff I was pointed to.

    1. Generalized stream fusion (http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/haskell-beats-C.pdf) doesn't do much for this problem, I suspect. With a simple vector-based implementation, writing to the buffer results in a chain of read/writes as the vector is copied. GSF would replace this with a single write and a memcpy. This would certainly be an improvement, but it's still O(n). What we really want is to not have to copy the entire vector each time. The problem is sharing memory without breaking referential transparency.
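
      To make that concrete, the naive pure buffer is essentially the following, so every write pays for a full copy (an illustrative sketch, not the exact benchmark code):

        import qualified Data.Vector.Unboxed as U

        -- Each push builds a brand-new vector: cons the new sample onto the
        -- front and drop the oldest from the tail.  The copy makes every
        -- write O(n) in the buffer length.
        pushNaive :: Double -> U.Vector Double -> U.Vector Double
        pushNaive x buf = U.cons x (U.init buf)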

      Although thinking about this has given me an idea for a new implementation that might be better all around. I'll try to code it up and check the performance. If it works, I'll add it to the combobuffer package.
