Sound Synthesis

Setting Up

For this activity we'll need some way to transport an electrical signal from the board into a speaker. The easiest way to do so is to use a speaker module such as the M5 Atomic SPK Base with an M5 Atom Lite (not S3!) board, but you can also just attach a little speaker or a minijack connector to any board's DAC pin. In the case of the ESP32, the DAC is available on pin 25. Note that newer variants such as the ESP32-S3 and ESP32-C3 do not have a built-in DAC.

The connection is very straightforward. The positive lead should connect to the DAC pin, and the negative lead to ground.

A Little Bit of Theory

What is Sound?

We hear sound when moving air reaches our ears. Sound is, put simply, air in movement. We are submerged in a large volume of fluid -air- that can be disturbed when certain objects move in it. Those disturbances propagate through the fluid and may eventually reach a super sensitive membrane inside our ears, which in turn generates electrical signals that are proportional to the magnitude of those disturbances, eventually reaching our brain and being interpreted as sounds.

Instruments constitute a very special category of sound-making objects. All harmonic instruments -that is, pretty much all instruments that can make different notes- vibrate in such a manner that their sound can be described by the combination of a bunch of sine waves.

What Is a Sound Wave?

A sound wave is nothing but a succession of sound pressure levels. Just like a stone thrown into a body of water creates a series of ripples on its surface, a sound-generating object creates a series of ripples in the air. We can't see them like we can see water ripples, but we can sense them with our ears or, if the sound is extremely loud and low-pitched, even with our whole bodies.

A sine wave is the particular kind of wave that you'll see when throwing a stone into a lake. It's a wave that repeatedly and smoothly transitions between a high and a low peak:

A sine wave, as we've said before, is also the kind of wave that harmonic instruments generate. With harmonic instruments, though, you usually get the sum of a few sine waves at the same time, just like you would if you threw a few stones into a lake all at the same time.

Electronic Sound Devices

As is often the case, technology takes inspiration from nature. Microphones work in a very similar way to our ears. A thin, super sensitive membrane receives air vibrations and those vibrations are converted to electrical voltage levels by an electromagnet. These are then transmitted to whichever audio equipment we want.

Incidentally, in the case of microphones, the process is also reversible. A speaker is just a microphone in reverse. When we send electrical voltage levels to a speaker, an electromagnet moves a membrane accordingly, and that results in air vibrations that reproduce the same sound that originated those electrical signals.

Electronic Music

With the invention of the speaker, for the first time in human history, we were free to generate any kind of wave we wanted, not just those that can be produced by the vibration of natural materials. That is, we didn't need to pluck a string, or hit a membrane, or force air into a pipe. We could sculpt a sound wave in as arbitrary a shape as we wanted.

The speaker, and whatever signal generating contraption we attach to it, has given rise to sounds that humanity had never heard before. Electronic music is music that makes use of this new way of generating sound waves.

So, if we want to make some electronic music, we'd better start looking at how to generate a few different waves!

Generating a Simple Wave

As we've seen, an electrical wave is just a succession of voltages. The easiest succession of voltages we can think of is just one that abruptly changes from one value to a different one at a fixed interval.

Since our speaker is connected to pin 25, the following code should alternate the voltage level on that pin every 3 milliseconds:

If all goes well, you should hear a sound similar to this one:

This is what we call a square wave, and if you open the audio file in a sound editor and zoom in, you'll see exactly why it receives that name:

The physical world is never perfect, but this wave does approximate 90° angles at every transition from low to high and from high to low voltage levels. These square angles are what give this wave its name.
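If it helps to see the same idea as text, here is a minimal Python sketch of that succession of values. It is not the MicroBlocks script itself: the function name, the 1 ms step, and the assumption that the pin alternates between 512 and 0 (starting high) are ours, for illustration only.

```python
def square_wave(half_period_ms, level, duration_ms, step_ms=1):
    """Return one output value per step_ms, toggling between `level` and 0."""
    samples = []
    for t in range(0, duration_ms, step_ms):
        # Integer division counts how many half-periods have elapsed:
        # even counts are the "high" half, odd counts the "low" half.
        high = (t // half_period_ms) % 2 == 0
        samples.append(level if high else 0)
    return samples


# Two full cycles of a wave that flips every 3 ms (assumed levels: 512 and 0).
wave = square_wave(half_period_ms=3, level=512, duration_ms=12)
print(wave)
```

Plotting these values against time would give the flat tops, flat bottoms and vertical edges described above.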

Let's now take a look at the two parameters that we can tweak in our code, and listen to what effect they cause in the resulting square wave. The first one is the interval at which the values change from low to high. Let's try doubling it, from 3 milliseconds to 6:

Increasing the interval resulted in a lower pitched sound. In particular, doubling the interval made the sound one octave lower.

Inspecting the wave also illustrates something very interesting:

So, apparently, the horizontal component of a wave determines its pitch. In wave terminology, we call this component the frequency of the wave. The higher the frequency, the more often the wave repeats itself, and the higher the pitch is.
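We can put rough numbers on this. The sketch below assumes that toggling the pin takes no time at all, so one full cycle of the wave lasts exactly two intervals (one high, one low). On a real board the loop adds a little overhead, so the actual frequencies will be slightly lower. The function name is ours.

```python
import math


def square_wave_frequency(toggle_interval_ms):
    # One full cycle is a high half plus a low half, so two intervals.
    period_s = 2 * toggle_interval_ms / 1000
    return 1 / period_s


f3 = square_wave_frequency(3)  # roughly 166.7 Hz
f6 = square_wave_frequency(6)  # roughly 83.3 Hz

# Each halving of the frequency is one octave down.
octaves_down = math.log2(f3 / f6)
print(round(f3, 1), round(f6, 1), round(octaves_down, 3))
```

Doubling the interval halves the frequency, which is exactly what "one octave lower" means.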

Let's now tweak the other parameter: the value that we assign to the pin. We started with 512, so let's now try to cut it in half:

The resulting audio is quieter, and its wave shape also illustrates that:

So, we can see how the vertical component of a wave determines its volume. In wave terminology, we refer to this component as the amplitude of the wave. The higher the amplitude, the more vertically stretched the wave is, and the higher the volume is.

The Analog Synthesis Library

Generating waves by directly setting values into the analog write block is an interesting exercise and it shows how voltage level fluctuations directly affect the sound that is generated, but this gets tedious quickly.

That is why MicroBlocks provides a library that abstracts a whole lot of concepts so we can focus on shaping the sound and creating music. Let's begin by importing the Analog Synthesis library from the Sound folder in the library browser.

Wave Shapes

The Analog Synthesis library provides blocks for the 5 classic foundational sound waves: sine, square, triangle, sawtooth and noise. You should now spend a few minutes experimenting with the different wave shapes and seeing how they sound. Try tweaking their frequencies -first parameter- and amplitudes -second one-, and see how that affects their pitch and volume. Note that the Analog Synthesis library provides a very wide amplitude range -0 to 32768- so that you have enough resolution to mix several waves together.

The next test worth trying is adding a few waves together. The simplest way to add two waves together is to just add them arithmetically using the + block:

At some point, though, you'll be adding lots of waves together and the amount of nested + blocks is going to quickly become hard to manage. That's why we suggest you use the sum block, which you can also find in the Operators category, at the bottom of the second block group.

You can try different ratios between frequencies and amplitudes and see what they do to the final timbre of the sound. You can also try using a wave to modulate the frequency or the amplitude of another wave:
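As a text-only illustration of both ideas, here is a small Python sketch. It does not use the Analog Synthesis library: the function names, frequencies and amplitudes are made up for the example, with amplitudes chosen to sit comfortably inside the 0 to 32768 range.

```python
import math


def sine(freq_hz, amplitude, t):
    """Value of a sine wave at time t (in seconds)."""
    return amplitude * math.sin(2 * math.pi * freq_hz * t)


def mixed(t):
    # Two waves added together: a 220 Hz wave plus a quieter one an octave up.
    return sine(220, 8000, t) + sine(440, 4000, t)


def tremolo(t):
    # Amplitude modulation: a slow 5 Hz wave moves the amplitude of the
    # 220 Hz wave back and forth between 4000 and 12000.
    amplitude = 8000 + sine(5, 4000, t)
    return sine(220, amplitude, t)


# Print the first few samples of each, at an assumed 8000 samples per second.
for i in range(4):
    t = i / 8000
    print(round(mixed(t)), round(tremolo(t)))
```

Replacing the amplitude argument with a wave gives a tremolo, as above. Doing the same with the frequency argument would give a vibrato-like effect instead.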

Gates and triggers

Now that you've been experimenting with designing sound textures, it would be nice to be able to use these as instruments where you can play patterns and melodies.

The Analog Synthesis library takes inspiration from physical synthesizers, where sound generators are "locked" behind a gate, and you can trigger those gates to open on command, either by pressing a key on the keyboard, by sending a specific signal to the synthesizer, or by some other means.

The library provides a few default gated generators, one for each basic wave.

Let's test one of these.

Now, if you run this script, nothing will happen. That's because gates are closed until something triggers them. To trigger a gate, you just run the trigger block.

Notice how the trigger block expects a MIDI key parameter. Simply put, MIDI is a standard that pretty much all electronic instruments understand. MIDI notes are specified as numbers, where 60 is middle C on a piano (usually labeled C4, although some manufacturers call it C3) and each step of 1 represents a semitone.

⋯ MIDI note numbers laid out on a piano keyboard ⋯
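To connect MIDI keys with the frequencies we've been tweaking so far, here is the standard equal-temperament conversion, where key 69 is the A at 440 Hz and every 12 semitones doubles the frequency. We're assuming the library follows this common tuning; the function name is ours.

```python
def midi_to_frequency(key):
    # Key 69 is A at 440 Hz; each semitone multiplies the frequency
    # by the twelfth root of 2, so 12 semitones make one octave.
    return 440.0 * 2 ** ((key - 69) / 12)


for key in (60, 69, 81):
    print(key, round(midi_to_frequency(key), 2))
```

Middle C (key 60) comes out at roughly 261.63 Hz, and key 81, one octave above 69, at 880 Hz.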

You could now use a list of numbers to define a melody and play it through a gate:

Envelopes

Another concept that the Analog Synthesis library borrows from synthesizers is the idea of an envelope. An envelope is, put briefly, a description of how long it should take for a gate to open -attack-, how long it should stay open -hold-, and how long it should take for it to close -release-.

Envelope types

There are many different types of envelopes, but the Analog Synthesis library uses the simplest of them all: linear Attack-Hold-Release (AHR for short). Other systems may add more stages to their envelopes, or they may let you define a shape -a function- for some or each of the stages.

To define both the amplitude and envelope for a particular gate, we can use the following block:

It is now a good idea to experiment with different values for attack, hold and release to familiarize yourself with what each one does to the final sound.
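If you'd like to see what a linear AHR envelope does in numbers, here is a small Python sketch. It is our own illustration, not the library's implementation: it returns the gain (from 0.0 to 1.0) that would multiply the wave's amplitude at a given time after the trigger.

```python
def ahr_gain(t_ms, attack_ms, hold_ms, release_ms):
    """Gain between 0.0 and 1.0 at t_ms milliseconds after the trigger."""
    if t_ms < 0:
        return 0.0
    if t_ms < attack_ms:
        # Attack: the gate opens in a straight line from 0 to 1.
        return t_ms / attack_ms
    if t_ms < attack_ms + hold_ms:
        # Hold: the gate stays fully open.
        return 1.0
    elapsed = t_ms - attack_ms - hold_ms
    if elapsed < release_ms:
        # Release: the gate closes in a straight line from 1 to 0.
        return 1.0 - elapsed / release_ms
    return 0.0


# A 10 ms attack, 20 ms hold and 40 ms release, sampled every 10 ms.
print([ahr_gain(t, 10, 20, 40) for t in range(0, 80, 10)])
```

Setting the attack to 0 makes the sound start at full volume right away, which comes in handy for struck instruments.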

Synthesizing Simple Instruments

Xylophone

With all these tools, we can already create crude approximations of a few acoustic instruments. For instance, we know that a xylophone must have a very short attack. When we hit a bar, the sound reaches its peak amplitude pretty much instantly. The note then starts decaying almost immediately, without holding its volume for too long. With this information, and knowing that all harmonic instruments produce sine wave sounds, we can set up a xylophone with the following values:

If we look at the wave shape for one of the notes in this audio, we'll see what the envelope did to the sine wave:

The wave is still sine-shaped, but the envelope gave it an amplitude modulation that's consistent with our idea of how a xylophone sound works.

Flute

A flute, on the other hand, will take a little bit to start vibrating at full volume. Thus, the attack time will be somewhat long. We can then hold the note for as long as we want, or as long as our lungs allow. This should mean that we can choose among several different values for the hold time. When we stop blowing, the sound decays rather quickly, but not abruptly. The release time, accordingly, should be relatively short but not too much.

Something like this should do it:

The wave shape also looks like what we'd expect from a flute sound:

Synthesizing Not-So-Simple Instruments

Something doesn't quite add up, though. Let's think of the envelopes for a piano and a violin.

When we press a piano key, a little hammer is activated, hitting a string pretty much instantly. Then, the sound starts decaying almost instantly without holding its volume for too long.

On the other hand, when we draw the bow over a violin string, the sound gradually rises until reaching its max volume. We can hold the volume for as long as we're rubbing the string, and then the sound decays rather quickly, but not abruptly.

If these sound familiar, it's because they describe the exact same envelopes that we used for the xylophone and the flute. Obviously, though, a piano and a xylophone sound distinctly different, as do a flute and a violin.

So, what gives these instruments distinct sounds?

Timbre

The timbre of an instrument is what gives it its particular quality; what makes us distinguish two instruments even if they're playing the exact same note at the exact same volume. Envelope is part of what defines timbre, but there's another equally important physical characteristic: the frequency spectrum.

When we play a piano key the hammer does hit a single string, but that vibration causes many other strings to also vibrate sympathetically at frequencies that are whole multiples higher or lower than the particular note we played. What's more, the string that we hit doesn't vibrate in a simple sinusoidal manner. Instead, it also vibrates with a mixture of frequencies that are whole multiples lower or higher than the fundamental one.

These extra frequencies are what, in music theory, we call overtones.

We've hinted at it, but it's important to stress that the frequencies of these overtones always hold a relationship of some whole multiple with the main note. This means that, if we hit an A4 note, which has a frequency of 440 Hz, its overtones are going to be 880, 1320, 1760, 2200 Hz, etc. (that is, 2, 3, 4 and 5 times the fundamental). Some of these overtones are going to be so faint that they'll contribute nothing to the timbre, and others may be almost as loud as the fundamental note. For some instruments, the overtones can also go in the other direction, lower pitched. In this case, they would be whole fractions of the fundamental, such as 220, 110 or 55 Hz.

The volumes and frequencies of these overtones, in relation to the main note, define the frequency spectrum of an instrument and, together with the envelope, give each instrument its particular recognizable sound.

Not so simple!

Some instruments will present a different spectrum depending on the fundamental note. Some other instruments, like bagpipes, will have what we call a drone (or bourdon) note that plays constantly in the background. Others will mix a hint of noise into their wave shape. We're picking instruments that are somewhat canonical so that we can illustrate the basics of sound synthesis, but the field is pretty much infinite!

Approximating a Piano

Even though a flute and a xylophone also feature a bunch of overtones, these weigh relatively less than the envelope in the definition of their timbre, which is why we could approximate their sound so easily. That's not the case for the piano, where the overtones are absolutely necessary if we want to approximate its sound.

In a piano, the loudest frequency is going to be the fundamental, followed by the overtones at two and four times its frequency, each pretty much at half the volume of the previous. There are also hints of further overtones, but for now we'll stay with these.

Let's now define a new block that will return the sum of all these frequencies for a particular gate. In the My Blocks category, click on Add a reporter block and call it piano gate. This will create a block definition in the scripting area.

Next, add a parameter to the block by clicking on the ▶ expansion arrow next to the block name. The default name for the new parameter is foo, but we can rename it to something more meaningful by right-clicking on it and clicking on rename. Something like gate would be appropriate.

After that, we'll need to find out what the current frequency and amplitude are for this gate, and store those into local variables. Let's name those appropriately, and use the amplitude for gate and frequency for gate blocks from the Analog Synthesis library to set their values.

I can't find the local variable block!

Make sure you're in advanced mode by activating it in the settings menu (gear icon), otherwise you won't find the initialize local var block in the Variables category.

Let's now add the main three different waves that compose the frequency spectrum of the piano timbre:

A sine wave with the fundamental frequency and full amplitude
A sine wave with double the fundamental frequency and half the amplitude
A sine wave with four times the fundamental frequency and a fourth of the amplitude
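
The three waves above can be sketched in Python as follows. This is only a text illustration of the same sum, not the piano gate block itself, and the names are ours.

```python
import math

# (frequency multiple, amplitude weight) for each wave in the list above.
PIANO_PARTIALS = [(1, 1.0), (2, 0.5), (4, 0.25)]


def piano_sample(fundamental_hz, amplitude, t):
    """Sum of the three sine waves at time t (in seconds)."""
    return sum(
        amplitude * weight * math.sin(2 * math.pi * fundamental_hz * multiple * t)
        for multiple, weight in PIANO_PARTIALS
    )


# First few samples of an A at 440 Hz, at an assumed 8000 samples per second.
print([round(piano_sample(440, 1000, i / 8000)) for i in range(8)])
```

Note that the sum can never exceed 1.75 times the base amplitude (1 + 0.5 + 0.25), which is worth keeping in mind when choosing an amplitude within the library's range.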

Finally, let's set the exact same envelope we used for the xylophone to gate 1, constantly output the piano block -also at gate 1-, and play some melody:

If you had an early electronic piano, such as the Casio PT-1, this will totally bring back memories! It's not the most realistic piano sound, but it's a pretty valid approximation.

Looking at the audio wave we can see how adding all these overtones has shaped the sound into something rather different from a simple sine wave:

Approximating a Violin

Now that we know the drill, let's make a violin block. The overtones for the violin are as follows:

The fundamental frequency at full amplitude
The first overtone at almost full amplitude
The second, third and fourth overtones at ¾ amplitude

The violin has many, many more overtones, but these should suffice for our approximation.

And this is what the same melody, played on the violin, sounds like:

The sound is kind of screechy and ragged, but it does approximate a violin, at least a retro synthesized one. For completeness, here's what the violin sound wave looks like:

Percussive Instruments

When it comes to percussion, we're going to distinguish between two types of instruments: harmonic and inharmonic.

Harmonic instruments are those based on sine waves, where overtones are whole multiples of the fundamental frequency. As we've said before, pretty much all acoustic instruments that can produce different notes are harmonic. So are drums that produce sound with a vibrating flexible membrane.

In inharmonic instruments, we may get random waveforms, or maybe sine waves with overtones that are not whole multiples of the fundamental frequency. These include things like cymbals, snare drums or chimes.

Kick Drum

When a kick drum is hit, its membrane vibrates at full amplitude pretty much instantly, and then starts decaying also rather quickly. We can approximate its wave shape with a simple low frequency sine wave, since drums tend to be quite low pitched.

Let's try it:

This isn't totally wrong, but it sounds more like a bass guitar than a drum. The reason is that we're just applying an envelope to the drum sound, but, as the vibration dampens, its frequency also changes, vibrating less and less quickly as the sound progresses. In other words, a drum starts at full amplitude and high frequency, and then both its amplitude and frequency get gradually lower and lower until the sound goes silent and extremely low-pitched.

This is what the frequency ramp block lets us simulate. We'll set up a ramp for gate 1 that tells it to start at full frequency and end at, say, 30% of its initial frequency.
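To get a feel for what such a ramp does, here is a rough Python sketch. We're assuming a straight-line ramp and a straight-line amplitude decay, which may not match how the library implements it, and all names and values are ours.

```python
import math


def ramped_frequency(start_hz, end_ratio, progress):
    """Frequency at `progress` (0.0 to 1.0) of a linear ramp."""
    return start_hz * (1 + (end_ratio - 1) * progress)


def kick_samples(start_hz, end_ratio, duration_s, sample_rate=8000):
    n = int(duration_s * sample_rate)
    phase = 0.0
    samples = []
    for i in range(n):
        progress = i / n
        gain = 1.0 - progress  # amplitude decays in a straight line
        samples.append(gain * math.sin(phase))
        # Advance the phase by the current frequency, so the pitch glides
        # smoothly instead of jumping.
        freq = ramped_frequency(start_hz, end_ratio, progress)
        phase += 2 * math.pi * freq / sample_rate
    return samples


# A quarter-second kick that starts at 100 Hz and ends near 30 Hz.
kick = kick_samples(start_hz=100, end_ratio=0.3, duration_s=0.25)
print(len(kick))
```

Accumulating the phase step by step, rather than computing the sine from the time directly, is what keeps the wave continuous while its frequency changes.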

Snare

A snare drum, on the other hand, is inharmonic. We can approximate it by applying an envelope to a noise gate, and triggering it at a pretty high note. To be fair, the snare drum does also have a membrane that vibrates harmonically, but the springs at the bottom pretty much cancel out any sine waves that may have been there in the first place.

Hi-hat

A hi-hat is also inharmonic and can be approximated with a noise gate, too, but it has a much shorter release than a snare drum and it's not as loud.
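The difference between the two can be sketched in Python as a burst of random values shaped by a decaying envelope. This is a simplified illustration with made-up names and values: it uses no attack or hold, a straight-line release, and it ignores the note the gate is triggered at.

```python
import random


def noise_burst(peak, release_ms, sample_rate=8000, seed=0):
    """Random values in [-peak, peak] that fade out over release_ms."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    n = int(release_ms * sample_rate / 1000)
    return [peak * (1 - i / n) * rng.uniform(-1, 1) for i in range(n)]


# Assumed values: the snare is louder and rings longer than the hi-hat.
snare = noise_burst(peak=20000, release_ms=150)
hi_hat = noise_burst(peak=8000, release_ms=40)
print(len(snare), len(hi_hat))
```

Both sounds come from the same noise source, and only the peak amplitude and the release time tell them apart.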

New Electronic Instruments

Up to now we've been trying to approximate sounds that existing acoustic instruments produce, but nothing stops us from inventing new instruments that make use of different wave shapes, envelopes and frequency spectrums.

Here's one that I made up:

Music Composition and Live Coding

Now that you know how to create your own instruments and sounds, you may want to put this into practice to make some music, maybe even live and in front of an audience.

MicroBlocks offers a few libraries that are useful in that regard, in particular Scales & Chords and Rhythm. To learn more about these, I'd suggest taking a look at the Live music with MIDI devices activity, but instead of sending out MIDI messages to an external synthesizer, you can route MIDI notes to an Analog Synthesis gate.

Here's a simple example of a synth melody, a kick drum, and a hi-hat playing at the same time:

The only thing left to do now is to have fun synthesizing your own music from thin air, quite literally!
