Audiograph: A Prototype Graphing Calculator For The Blind Using Libaudioverse

Austin Hicks

2017-04-10 14:13

I'm trying to fund development on Libaudioverse, my large library for audio synthesis and pet project for the last 3 and a half years. One of the things I need to do is justify it. So I took a week and made Audiograph, a graphing calculator prototype for the blind that people are already finding useful and contributing to. It is my hope that this will help fund Libaudioverse, and potentially be a fundable project in its own right. You can find more information on Libaudioverse itself here and the GoFundMe is here.

now for actual details.

Update: It occurred to me after the fact that I'm not making this very clear. Anyone can donate any amount to the GoFundMe. Got $5? That helps. 100 people donating $5 is $500.

The What And Why Of Audiograph

Audiograph is a GPL2-licensed prototype of a graphing calculator for the blind. Using Libaudioverse, it can produce audio representing graphs. Being as I am blind myself, this post will sadly not have visual representations. If you're sighted, you will probably get the idea nonetheless.

The really fast version, if you want to try it directly: download this. Unzip it. Run audiograph.exe. Type .help for info on commands, and enter equations at the prompt. The syntax of equations is defined as whatever the Sympy equation parser accepts. See later in this post for examples. At the moment this is Windows-only. You can get it going on Linux if you build Libaudioverse from source. Mac requires me to port Libaudioverse, which is somethin that goes on the "Why I need you to help fund it" list.

I'd like to say this is entirely philanthropic, but I'm trying to Fund Libaudioverse Development via GoFundMe. I made Audiograph in about 10 hours over the course of a week using only Python and Libaudioverse for synthesis. Nothing in Audiograph currently uses a prerecorded sound. It is all dynamically synthesized on the fly. Libaudioverse has been my pet project for around 3 years, plus a lead-in time in which I taught myself digital signal processing. I need to have a couple of months of full-time work to get an initial stable release and money to hire a programmer who can do graphical demos. I'll go into the improvements to Libaudioverse that will make this specific project better later in this post.

but unlike all the other demos I could do, this one is actually useful in itself. I've already got multiple contributors. I also know someone for whom this is the first graphing calculator he's used. This was unexpected and shows us that I underestimated the need. My secondary hope here is that someone in the accessibility community might be willing to fund me so that I can continue producing tools like this and extend Audiograph into a completed project. I will have to produce some of them for myself at some point. I'd rather make money for it and end up with something useful to others as opposed to doing it for free and ending up with the bare minimum I need for my projects. While this post's primary purpose is to give people a reason to fund Libaudioverse, I think it also demonstrates that I have the qualifications for this kind of work.

Some Examples

If you can't play these demos for some reason, you can get all of them plus a few more as a zip.

You'll probably want to wear headphones for these. Equations are in the format you need to enter them into Audiograph; ** is exponentiation.

First, the line y = x with the x range set as 0 <= x <= 10 and the y range set to the same:

If we move the x range to -3 <= x <= 3 then we can get a nice parabola, specifically x**2:

Audiograph supports ticking at specific multiples of x and y. It also ticks when functions cross y = 0. This is a feature I have not seen in any other graphing calculator for the blind, and I've had at least one other person make the same observation. These features are off by default.

If we set the x range to 0 <= x <= 13 and turn both on, we can graph sin(x) and have exactly two periods. Audiograph is configured with the following thre commands for this example: .0ticks on, .xticks 3.14, and .duration 10.

One thing that often comes up in computer science is big-O notation. Three complexity classes are particularly common: O(n), O(ln n), and O(n ln n). We have O(n) above as the line, because that's all it is. Here are examples of O(ln n) and O(n ln n), with audiograph best configured to demonstrate what is going on.

First, O(ln n). This was done with 0 <= x <= 10. ln n grows so slowly that widening the range is almost meaningless. The equation entered into Audiograph is ln(x). We enable y ticks at multiples of 1 and turn x ticks off entirely. Notice how long it takes to get to 2, as compared to how long it takes to get to 3.

For O(n ln n), it is useful to widen the range to 0 <= x <= 40 and 0 <= y <= 100. All other settings are the same.

I also produced all of these demos with HRTF enabled. By far the best (but still lame) is the sin(x) example from above. See below for a discussion of why HRTF works so poorly here; the short version is that this is the worst possible demo for it and Libaudioverse's HRTF needs improvements I haven't had time for. HRTF versions of all demos are available in the zip file linked above.

I don't demonstrate it here, but undefined values, complex-valued expressions, or expressions we otherwise can't evaluate at all points (for example x**0.5 for negative x) have said values replaced with pink noise.

It's A Prototype: Known Issues, Limitations, And Needed Improvements In Libaudioverse And Audiograph

Let's cover Audiograph specifically, and get them out of the way before we move to Libaudioverse:

It's command line. It can't do graphics or provide better/more intuitive UIs via a command line interface. Continuing it needs a GUI.

The internal architecture is incapable of being adapted to other graph types. We can't do polar without a rewrite, for example. Essentially what is needed internally is some of the components I'd have written if I'd gotten the Holman Prize. Unfortunately I didn't make it to the semifinals, so I won't be getting it. If this continues, I'll produce some of the most critical pieces from what I'd have done for Holman as part of it.

It might not run on some systems, particularly Windows 7. This is because of something quite esoteric that's not my fault. Currently this doesn't seem to be a problem. I made sure to get someone to try it on Windows 7 and Windows 8. But if it isn't running for you and you're not on Windows 10, get in touch. If this is a problem, it's one I can't fix without someone who has the problem to test things for me.

Now, Libaudioverse:

The oscillators used by Audiograph are additive. Put another way, the oscillators used by Audiograph are incredibly slow compared to what they could be. Since Libaudioverse itself is capable of synthesizing in multiple threads, it should hopefully be really hard for someone to have a computer that is insufficiently powerful. But many blind people can't upgrade their computers regularly, so I'm almost sure someone will. This needs to be optimized, especially since the "real" version will want to use more oscillators simultaneously. The irony here is that I tried to do this a long time ago but ended up in a place where I needed graphing tools to finish.

The noise nodes that we use can also be made something like ten times faster with a relatively trivial optimization that occurred to me after I started writing this blog post. That's the way of programming, though.

And finally, we could benefit from three additional nodes.

The first is called a stereo widener. Most panning is done with only volume changes. But there's a number of tricks that don't yet exist in Libaudioverse that can make it feel like your headphones are bigger, in a hard to describe way. The upshot is that small panning changes are potentially more noticeable. With simple XY plotting this isn't such a big deal, but it could be useful for polar and complex-valued graphing. This is available in tools like Reaper and Audacity as plug-ins.

The second is a stereo-to-binaural filter. The name is misleading. It's not HRTF or 3D audio. Instead, it takes dry stereo or mono signals and makes them feel like they're outside your head a bit, along the lines of putting you in a virtual room, but without being a reverb and with something like a hundredth of the processing requirements.

And the third is a panner that pans based off values between -1 and 1. Currently Audiograph is using a panner that works off angles and wants speaker maps. It works, but it's annoying.

On Why HRTF Doesn't Work

There are two big points to take away here, plus a bunch of smaller stuff.

3D audio is amazing for games. But it's not very precise in the vertical direction and it's very hard to make it work when you aren't simulating a realistic 3D environment. Even if it worked perfectly and sounded amazing in this application, we'd still have to do everything else, and I'd bet most people wouldn't bother to turn it on. Graphing calculators aren't about sounding amazing, they're about conveying information accurately. Simulating it as a plane in front of you might sound amazing, but Graphing is not about being cool, it is about being functional.

The other big point is this: Libaudioverse is something like 15000 lines of code. Less than 1000 lines of code are related to HRTF. But I didn't market it heavily myself and most people aren't developers. So what everyone "knows" about Libaudioverse is that it's the 3D audio library. This is incredibly inaccurate. Libaudioverse is not the 3D audio library. Libaudioverse is the audio synthesis library that can or will be able to do everything you could ever want for realtime synthesis, including really good 3D audio. But I am backed into a corner now. Anything I demo that doesn't somehow use HRTF counts against me to people who aren't programmers.

So. That's all well and good, but why does the HRTF demo sound awful? It should at least sound okay, right?

The first thing is that we're running pure tones through it with no reverb. That's the perfect recipe for hearing the artifacts of a filter amazingly clearly. When you use something less pure and with a richer spectrum, the artifacts of the filters become less pronounced. Add on reverb and they're covered up pretty much completely.

the other thing is that I need to convert the HRIR to minimum phase, diffuse field equalize it, and apply the interaural time delay using a spherical or elliptic model of the head. I haven't done it because it is more important to me to get Libaudioverse out the door even if it is less than perfect, and also because it's very mathematical and I've only just now reached the point where the explanation in the previous sentence is something I think I understand fully and can implement easily. I'm trying to avoid going into experimental territory this close to a useful initial release as much as is feasible. I will probably write another post on the general state of Libaudioverse that breaks this down more, but this is quite long already and a full explanation would increase the length of this post by half. Suffice it to say that these improvements should completely remove the artifacts here and also make vertical HRTF work at least twice as well as it does now.

Running From Source And Duplicating The Demos

Here is the Audiograph repository.

Create a Python virtual environment of at least 3.5. Run the following two commands in it:

py -m pip install --pre libaudioverse
py -m pip install sympy

Then clone the repository and run py audiograph.py.

Audiograph has basic batch and file writing support. The demos were produced by running .batch demos.txt from the Audiograph prompt and waiting. This will produce a bunch of stereo .wav files in the current directory.