Linux Audio Control Topology

This is a fairly raw extract from our intranet, so it has more detail on the AudioScience APIs (HPI, ASX) than on the others. However, we are looking at how and what to describe in a future API. I’m posting it here in the context of the current discussion about ALSA control topology.

ALSA

Controls are identified by name, and possibly index (though most drivers only use index=0).

Control values are an array of the same datatype, or an enumerated set of strings.
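
As a rough picture of that data model (this is not the ALSA API itself), a control can be thought of as a (name, index) key mapped to an array of values of one datatype, or to choices from a fixed list of strings. The class and field names below are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Value = Union[int, bool, str]


@dataclass
class Control:
    name: str                            # e.g. "Master Playback Volume"
    index: int = 0                       # most drivers leave this at 0
    values: List[Value] = field(default_factory=list)
    enum_items: Tuple[str, ...] = ()     # non-empty only for enumerated controls

    def key(self) -> Tuple[str, int]:
        """Identity of the control: name plus index."""
        return (self.name, self.index)

    def set(self, values: List[Value]) -> None:
        """Store a new value array, enforcing a single datatype."""
        if len({type(v) for v in values}) > 1:
            raise TypeError("all elements of a control value share one datatype")
        if self.enum_items and any(v not in self.enum_items for v in values):
            raise ValueError("enumerated controls only accept their listed strings")
        self.values = list(values)


volume = Control("Master Playback Volume")
volume.set([40, 42])                     # one element per channel
source = Control("Capture Source", enum_items=("Mic", "Line"))
source.set(["Line"])
```

Note that nothing in this model says what a control is connected to, which is the gap the topology discussion is about.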

There is no topology information in the standard API. There are exceptions: the HDA driver reveals NID data via a special file. ASoC drivers contain topology information to assist with DAPM (Dynamic Audio Power Management), but AFAIK it is not exported to userspace.

Adding more topology info is being requested and discussed:
http://thread.gmane.org/gmane.linux.alsa.devel/62416
http://thread.gmane.org/gmane.linux.alsa.devel/52498

JACK

Applications provide N output
ports and/or M input ports. Arbitrary connection from any output to any
input. Multiple connections to a single input are summed.
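
The summing rule can be sketched in a few lines. This is only a model of the connection semantics described above, not JACK's API; the function name `mix` and the port names are assumptions made for the example.

```python
from collections import defaultdict


def mix(connections, output_buffers):
    """Return the buffer seen by each connected input port.

    connections: iterable of (output_port, input_port) pairs.
    output_buffers: dict mapping each output port to its list of samples.
    Buffers are assumed to have equal length within one processing cycle.
    """
    feeds = defaultdict(list)
    for out_port, in_port in connections:
        feeds[in_port].append(output_buffers[out_port])
    # Every input receives the sample-by-sample sum of all its sources.
    return {port: [sum(frame) for frame in zip(*buffers)]
            for port, buffers in feeds.items()}


buffers = {"synth:out": [0.25, 0.5], "player:out": [0.5, 0.25]}
wires = [("synth:out", "system:playback_1"),
         ("player:out", "system:playback_1"),
         ("synth:out", "recorder:in")]
mixed = mix(wires, buffers)
```

Here `system:playback_1` receives the sum of both sources, while `recorder:in` receives the synth output unchanged.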

Doesn’t address within-application controls. Each application provides its own UI.

ALSA can be considered as a privileged client application which
provides in and out ports corresponding to soundcard channels, and
provides the master timebase for all other apps.

CobraNet

SNMP

Bundle addressing; connectivity is determined by receivers (as long as a potential transmitter exists for multicast or broadcast).

Within bundles, channels determine audio content.

IEC 62379

SNMP

Connectivity separate from function.

Blocks with ports

Mathematics

Network theory

Physical analogy

Patch cables connect things

Knobs and buttons control things.

The web

RDF http://www.w3.org/TR/REC-rdf-syntax/

HTML pages link to other pages. Pages can contain data and controls.

LV2

http://lv2plug.in/ LV2 is a standard for plugins and matching host applications, mainly targeted at audio processing and generation.

I.e. it addresses objects that process audio and have controls

All control and data connects to ports of the plugin.

OSC

Controls are identified by addresses that look like paths, e.g. “/channel/1/fader”.

An OSC receiver has an IP address and a port number.
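
As a rough illustration of how such an address travels on the wire, the sketch below packs one float argument into an OSC 1.0 message: the address and the type-tag string are each NUL-terminated and padded to a multiple of four bytes, followed by a big-endian 32-bit float. The helper names `osc_pad` and `osc_message`, and the receiver address in the comment, are made up for this example.

```python
import struct


def osc_pad(text: str) -> bytes:
    """Encode an OSC string: ASCII, NUL-terminated, padded to 4 bytes."""
    raw = text.encode("ascii") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)


def osc_message(address: str, value: float) -> bytes:
    """Build an OSC message carrying a single float32 argument."""
    return osc_pad(address) + osc_pad(",f") + struct.pack(">f", value)


msg = osc_message("/channel/1/fader", 0.5)

# To deliver it, send the bytes as one UDP datagram to the receiver, e.g.
# socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(msg, ("192.0.2.10", 9000))
```

The path-like address is the only "topology" OSC offers: it names a control but says nothing about what that control is wired to.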

Intel HDA codec

The codec contains ‘widgets’.

Each widget has a numeric NID (node ID). NID#0 refers to the overall codec.

Widgets have zero or one outputs and zero..N inputs.

Connectivity information: The list of NIDs that connect to its inputs can be read from each widget.

There is a ‘function group’ widget that acts as a container for
others. A container holds a set of sequentially numbered widgets; its
start index and count can be queried. NID#1 is the audio function group.

The other widget types are audio widgets: input, output, mixer, selector, pin complex, power, volume knob.

There is a set of Verbs that act on the widgets (i.e. commands or queries).
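
The widget model described in this section can be sketched as a small data structure: each widget stores the NIDs feeding its inputs, and a function group is just a start NID plus a count. The class names and the three-widget layout below are invented for illustration and do not describe a real codec or the driver's data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Widget:
    nid: int
    kind: str                                         # e.g. "mixer", "pin complex"
    inputs: List[int] = field(default_factory=list)   # NIDs connected to the inputs


@dataclass
class FunctionGroup:
    nid: int
    start_nid: int     # first contained widget
    count: int         # number of sequentially numbered widgets

    def member_nids(self) -> range:
        return range(self.start_nid, self.start_nid + self.count)


def edges(widgets: Dict[int, Widget]) -> List[Tuple[int, int]]:
    """Turn the per-widget input lists into (source NID, destination NID) pairs."""
    return [(src, w.nid) for w in widgets.values() for src in w.inputs]


afg = FunctionGroup(nid=1, start_nid=2, count=3)
widgets = {
    2: Widget(2, "audio output"),
    3: Widget(3, "mixer", inputs=[2]),
    4: Widget(4, "pin complex", inputs=[3]),
}
signal_flow = edges(widgets)   # [(2, 3), (3, 4)]
```

Because connectivity is stored on the receiving side, recovering the forward signal flow means walking every widget once, as `edges` does.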

OS X Audio Units

“An audio unit (often abbreviated as AU in header files and
elsewhere) is a Mac OS X plug-in that enhances digital audio
applications such as Logic Pro and GarageBand.
You can also use audio units to build audio features into your own
application. Programmatically, an audio unit is packaged as a bundle
and configured as a component as defined by the Mac OS X Component
Manager.

At a deeper level, and depending on your viewpoint, an audio unit is one of two very different things.

From the inside—as seen by an audio unit developer—an audio unit
is executable implementation code within a standard plug-in API. The
API is standard so that any application designed to work with audio
units will know how to use yours. The API is defined by the Audio Unit
Specification.

An audio unit developer can add the ability for users or
applications to control an audio unit in real time through the audio
unit parameter mechanism. Parameters are self-describing; their values
and capabilities are visible to applications that use audio units.

From the outside—as seen from an application that uses the
audio unit—an audio unit is just its plug-in API. This plug-in API lets
applications query an audio unit about its particular features, defined
by the audio unit developer as parameters and properties.”

http://developer.apple.com/documentation/MusicAudio/Conceptual/AudioUnitProgrammingGuide/Introduction/Introduction.html
http://developer.apple.com/documentation/MusicAudio/Conceptual/AudioUnitProgrammingGuide/TheAudioUnit/TheAudioUnit.html#//apple_ref/doc/uid/TP40003278-CH12-SW1
http://developer.apple.com/documentation/MusicAudio/Conceptual/CoreAudioOverview/WhatsinCoreAudio/WhatsinCoreAudio.html#//apple_ref/doc/uid/TP40003577-CH4-SW6

GStreamer

GStreamer is a framework for creating streaming media applications. The
fundamental design comes from the video pipeline at Oregon Graduate
Institute, as well as some ideas from DirectShow.

The framework is based on plugins that will provide the various codec
and other functionality. The plugins can be linked and arranged in a
pipeline. This pipeline defines the flow of the data. Pipelines can
also be edited with a GUI editor and saved as XML so that pipeline
libraries can be made with a minimum of effort.

http://gstreamer.freedesktop.org/data/doc/gstreamer/head/manual/html/index.html

Windows

Wave

kMixer?

Possible new HPI

(Moved here from under HPI heading, leaving that to discuss current implementation)

A basic topology will have just controls and connections. I.e. the controls ARE the nodes.

Controls modify or measure the signal going through them.
Special cases are sources and sinks which have no input or output
respectively. Meters can be represented as either passthrough or input
only.
Controls have labels which sort of correspond to HPI nodes, but are
unique per control. I.e. nodes don’t have multiple controls. The unique
ID could just be the control index.

A single control can have multiple attributes, e.g. a tuner has band and frequency.

Connections always have source and destination that are controls.

A simplified example (ASX like)

Controls:

1 Player1
2 PlayerMeter1
3 PlayerSRC1
4 PlayerVolume1
5 Player2 (leave out meter etc for brevity)
6 MixGain11
7 MixGain12
8 MixGain21
9 MixGain22
10 Sum1
11 Sum2
12 LineoutLevel1
13 LineoutLevel2
14 Lineout1
15 Lineout2
16 OutMeter1

Connections:

1->2   Player1 - PlayerMeter1
1->3   Player1 - PlayerSRC1
3->4   PlayerSRC1 - PlayerVolume1
4->6   PlayerVolume1 - MixGain11
4->7   PlayerVolume1 - MixGain12
5->8   Player2 - MixGain21
5->9   Player2 - MixGain22
6->10  MixGain11 - Sum1
7->11  MixGain12 - Sum2
8->10  MixGain21 - Sum1
9->11  MixGain22 - Sum2
10->12 Sum1 - LineoutLevel1
11->13 Sum2 - LineoutLevel2
12->14 LineoutLevel1 - Lineout1
13->15 LineoutLevel2 - Lineout2
10->16 Sum1 - OutMeter1
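
To show what a client could do with such a listing, here is a minimal sketch that loads the connections by control name, finds the sources and sinks, and enumerates the signal paths from a given source. The function names are hypothetical; only the control names come from the example above.

```python
from collections import defaultdict

CONNECTIONS = [
    ("Player1", "PlayerMeter1"),
    ("Player1", "PlayerSRC1"),
    ("PlayerSRC1", "PlayerVolume1"),
    ("PlayerVolume1", "MixGain11"),
    ("PlayerVolume1", "MixGain12"),
    ("Player2", "MixGain21"),
    ("Player2", "MixGain22"),
    ("MixGain11", "Sum1"),
    ("MixGain12", "Sum2"),
    ("MixGain21", "Sum1"),
    ("MixGain22", "Sum2"),
    ("Sum1", "LineoutLevel1"),
    ("Sum2", "LineoutLevel2"),
    ("LineoutLevel1", "Lineout1"),
    ("LineoutLevel2", "Lineout2"),
    ("Sum1", "OutMeter1"),
]


def build_graph(connections):
    """Map each control to the list of controls it feeds."""
    graph = defaultdict(list)
    for src, dst in connections:
        graph[src].append(dst)
    return graph


def endpoints(connections):
    """Return (sources, sinks): controls with no input / no output."""
    srcs = {s for s, _ in connections}
    dsts = {d for _, d in connections}
    return sorted(srcs - dsts), sorted(dsts - srcs)


def paths(graph, start):
    """Yield every path from start to a control with no outgoing connection."""
    successors = graph.get(start, [])
    if not successors:
        yield [start]
        return
    for nxt in successors:
        for tail in paths(graph, nxt):
            yield [start] + tail


graph = build_graph(CONNECTIONS)
sources, sinks = endpoints(CONNECTIONS)
for path in paths(graph, "Player2"):
    print(" -> ".join(path))
```

With only controls and connections, the order in which gains and meters are applied falls out of the graph walk rather than being implied by enumeration order.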

AGE comments

If we are rewriting the way controls are handled I want to see control type:

ABSTRACT_CONTROL
with property of control type
with a connects to list
with a parent/child property
parent controls would contain a list of child controls
child controls would have basic types like "int", "string", "multiplexer"

The goal would be that once we have a multiplexer implemented
ONCE in ASIControl, any other controls that expose that property would
be automatically implemented. There would be no additional custom
coding.
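
One way to read this proposal is sketched below: every control is an instance of one abstract class, and each basic type gets exactly one handler, registered once and reused wherever that type appears. This is not ASIControl code; the class, the `RENDERERS` registry and the rendering strings are all invented for the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class AbstractControl:
    name: str
    control_type: str                    # "int", "string", "multiplexer", or a parent type
    value: object = None
    connects_to: List["AbstractControl"] = field(default_factory=list)
    parent: Optional["AbstractControl"] = field(default=None, repr=False, compare=False)
    children: List["AbstractControl"] = field(default_factory=list)

    def add_child(self, child: "AbstractControl") -> "AbstractControl":
        child.parent = self
        self.children.append(child)
        return child


# One handler per basic type, written once.
RENDERERS: Dict[str, Callable[[AbstractControl], str]] = {}


def renderer(control_type: str):
    def register(fn):
        RENDERERS[control_type] = fn
        return fn
    return register


@renderer("int")
def render_int(c: AbstractControl) -> str:
    return f"{c.name}: slider={c.value}"


@renderer("multiplexer")
def render_mux(c: AbstractControl) -> str:
    return f"{c.name}: select={c.value}"


def render(control: AbstractControl) -> List[str]:
    """Parents are rendered as the concatenation of their children."""
    if control.children:
        return [line for child in control.children for line in render(child)]
    return [RENDERERS[control.control_type](control)]


tuner = AbstractControl("Tuner1", "tuner")
tuner.add_child(AbstractControl("band", "multiplexer", value="FM"))
tuner.add_child(AbstractControl("frequency", "int", value=101500))
lines = render(tuner)
```

A new parent control built from existing child types then needs no new rendering code at all.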

HPI

HPI concepts

node
a place with a type and index. Nodes are either source or destination, not both.
control
active element with a number of attributes (read-only or read/write). Attached to a single node, or between a source and destination.
attribute
setting or measurement value of a control. Some attributes have more than one dimension (multiplexer-like attributes, which have values of [node type, index]).

Commentary

Having source and destination nodes means HPI cannot express many
topologies accurately. I.e. it is not possible to have a chain of
nodes.

Having volume controls “on” single nodes doesn’t reflect a
volume control having an input on one side and an output on the other.

Summing is implicit when multiple volume controls attach to a single destination node.

Multiplexer controls are only attached to a single node, representing their output. In the case of the linein mux, this node is a source node, even though it is the output of the mux. Connections of mux inputs are implicit.

Linein analog/digital muxes have ‘linein’ as both a source and destination!?

Node types vs control types. There is no direct connection
between the two, i.e. a tuner control can be attached to an outstream node.

Attributes with 2 disparate parts to the value don’t map easily to a single basic datatype. (maybe string?)

HPI meters, levels and volumes are implicitly mono or stereo. HPI message doesn’t support more channels.

HPI controls provide grouping for attributes, by having multiple attributes attached to a single control

EWB critique

I think the idea of nodes and controls is overly complicated, and at the same time too inflexible.

Inflexible because

  • it only allows a single-layer network: source node – destination node,
  • nodes are either source or destination.

Complicated because

  • nodes have multiple controls,
  • the order of control application is hidden (apart from the ‘enumerate in signal flow order hack’),
  • some controls are ‘on’ nodes, others are between nodes,
  • for many controls, the node vs control distinction is redundant, e.g. tuner control/node, mic control/node.

Delio’s attempt at visualizing it all

(Insert object diagram here)

Reading the diagram from the top-left:

A single ASX subsystem ‘controls’ many adapters. Each adapter
has multiple modes, and each mode consists of a different collection of
controls.

Each control can optionally send/receive data to/from one or
more controls. Each control has zero or more ‘properties’ that can be
either read only or read write. Controls can be grouped into a control
group. A control group can have a number of read only properties used
to describe how to present its controls to the user.

In addition to the comments above I think that we also need to
figure out exactly what it means for one control to be connected to
another. As I see it there are two data channels in and out of any
control. The sample stream channel and property access channel (think
of it as in band and out of band data). The sample stream channel
carries the data to be processed by the control; the property access
channel carries parameter change commands. For instance: an autofader
control could be implemented simply as a control that emits
“set-volume” commands to a volume control. The relevant topology
snippet would look like:

    [control] --> volume -> [control]
       autofader----^

In the graph above the horizontal flow is the sample stream
(processed through the volume control) while the vertical arrow is the
command stream from the autofader control. The autofader control does
not process samples but simply automatically updates the level property
of the volume control.

The same could be implemented as an autofading volume control:
a single control that supports autofade. The advantage of a model that
separates the data and command channels is that it is more modular.
Simpler controls can be designed and connected together rather than
having to add features to existing controls.
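
The two-channel idea can be sketched as follows: the volume control exposes a sample-stream method and a property-access method, and the autofader only ever uses the latter. The class names, the `gain` property and the linear ramp are assumptions made for this example, not an existing HPI or ASX interface.

```python
class Volume:
    """Processes samples; its gain is changed only through set_property."""

    def __init__(self, gain=1.0):
        self.properties = {"gain": gain}

    def set_property(self, name, value):     # property access channel
        self.properties[name] = value

    def process(self, samples):              # sample stream channel
        gain = self.properties["gain"]
        return [s * gain for s in samples]


class Autofader:
    """Emits set-gain commands to a target control; never touches samples."""

    def __init__(self, target, start, end, steps):
        self.target, self.start, self.end, self.steps = target, start, end, steps
        self.step = 0

    def tick(self):
        """Send the next command. Returns False once the fade is complete."""
        if self.step > self.steps:
            return False
        gain = self.start + (self.end - self.start) * self.step / self.steps
        self.target.set_property("gain", gain)
        self.step += 1
        return True


volume = Volume()
fader = Autofader(volume, start=1.0, end=0.0, steps=4)
levels = []
while fader.tick():
    levels.append(volume.process([1.0])[0])   # one block per tick
```

The volume control needs no knowledge of fading, and the same autofader could drive any control that accepts a `gain` property.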

ASX

Similar to HPI in concept. Adds players and recorders as controls (these loosely map to HPI streams).
