*** MUS171 #20 03 10 |
Miller: @0000 We'll finish up with GEM today. This is an example of taking audio and graphics and in some sense, bridging the divide between them. And this is a very good patch | es: @0000 |
@0015 for you to load on your laptops while your professor drones away. [laughter] Basically, well, you can tell exactly what's happening. | es: @0015 |
So, as I was saying, GEM is all about polygons. @0030 And what you see here is a bunch of polygons. But so far, I've shown you two rectangles, so this is a whole sample with a whole mess. So what I want to do is just sort of show you how you put together something very simple like this. | es: @0030 |
@0045 And I've got to say, this is not going to win you any prizes being able to do this kind of thing. But the interesting stuff that I think that you can do with GEM, with geometric kind of modeling like this | es: @0045 |
@0060 is things where you algorithmically generate hundreds of thousands or millions of polygons to make shapes or other kinds of collections of stuff in space. | es: @0060 |
And, I don't have any good examples of that sitting here right now. So, I'm not going to @0075 try to develop one. Instead, I'm just going to show you this kind of very simple sort of generating example of "OK, here's how you would use audio analysis to drive a picture." And then, we'll just sort of leave GEM | es: @0075 |
@0090 for future years. And then I have five other topics in Pd that I'm not going to have time to do properly, so I want to mention that they exist and give each one of them maybe 10-ish minutes each. | es: @0090 |
@0105 That will be plenty for today. And then, we're going to be done for the quarter -- except of course for final projects. | es: @0105 |
So, the plan for today is not just one thing: It's a bunch of things but each of them is @0120 only just kind of on the surface. One is, I'm going to finish up with GEM with this example. And then, the example is actually an example using audio analysis, and there are lots and lots of things that you can do with | es: @0120 |
@0135 analyzing sounds. | es: @0135 |
And so, I want to show you the basic tools that Pd has available for doing audio analysis and some of the things that you can do with that. And then, there are other things that you might want to know about which I will tell you about @0150 as I get to them; but, basically, four other classes of things. | es: @0150 |
Each of these are just sort of quick topics. "netreceive and netsend" is making network connections between computers. @0165 "readsf~ writesf~" is spooling audio to and from disc so that you can do things like make sound with Pd and have the sound file afterward. And, pd~ is a trick for doing Pd with multiprocessing. It's a way of having Pd sprout child | es: @0165 |
@0180 Pd's that can run on other processors. So, each of those things takes a few minutes to describe and then they're done. Fourier analysis and re-synthesis: I'm going to just show you one very simple example of this, but this is a thing which you could easily study for | es: @0180 |
@0195 months or years. In fact, it's true that most people have to study this and think they learn it and then come back a year later and realize that | es: @0195 |
@0210 there's actually another point of view on it that you wanted to know it from. And so on for three or four iterations before you really have enough points of view down that you really believe that you know Fourier analysis and synthesis fluently. | es: @0210 |
So, this is a thing which I want to tell you the existence of, @0225 but we will not actually start going into it, because it is just a big thing. | es: @0225 |
But, anyway, back to the GEM example. So what's happening? Every time I talk, the sound is going to the microphone. @0240 So, we're having to do two things. Really, we are measuring the loudness of a sound. | es: @0240 |
That's a technique which is old and it dates back to the, at least to the analog synthesizer days, it's called "envelope following" and it's one of the three kinds of analysis that I am @0255 going to be showing you in some more detail next time. But at least now, what I'll do is show you how it works into this example. | es: @0255 |
So, envelope following. This is kind of a mess. I like actually running sound files @0270 because if you have a recording of your favorite politician giving a speech, this is a great way to listen to that. So, what I am going to do is try to figure out where all the stuff is... (Not there. | es: @0270 |
@0285 Come on. What am I doing wrong? Got that, got that. Audio in. Oh, "pd works". Here we are. This is the real thing.) So, what does a bird consist of? | es: @0285 |
@0300 Last but not least is going to be "pd sound" where we are actually getting the loudness of the sound coming into the microphone. So, I'm going to save that because, for continuity's sake, I will just talk about the graphics first -- even though | es: @0300 |
@0315 the flow of information is from sound to graphics. So, I'm going to be going upstream in information flow here for a few minutes. | es: @0315 |
So, what would you do? Well, these things are all, @0330 well, these, I'll do this in some detail. This, these hundred rectangles are going to be the bird's body. So, part of the trick to drawing the bird is making this shape, | es: @0330 |
@0345 which is a very irregular shape that you can't make very simply out of polygons. So there is a thing for making a hundred polygons that makes that shape that I'll show you. | es: @0345 |
Then, there are isolated things which are... Let's see. @0360 This thing is a rectangle, this twig that it's sitting on. The legs are I believe trapezoids although I have to go check. And, these eyes are actually hexagons. That's cheesy, but | es: @0360 |
@0375 that's what I did. And this beak is three triangles. There are two triangles for the top part of it and just one triangle for the bottom. And, those triangles are the only thing in the whole thing that's moving; and in fact, the only thing that's moving in this, are | es: @0375 |
@0390 four points in space which are: | es: @0390 |
First off, the two sides of the beak which are [clapping noises 06:33] chosen at random whenever it finds a new attack. So that, basically, the width of the beak is sort of changing word by word @0405 except it's not working too great right now. And then the, you can tell that the two points in front of the beak -- which if there's no sound going at all are one point there. | es: @0405 |
There are two points @0420 which lie on a vertical segment and they're some fixed point plus and minus the envelope value that is coming in from the incoming sound. | es: @0420 |
So, these two are random, but are set off by attacks of sound. @0435 And these two are the continuously changing envelope. That's the whole deal. So, how do you do it? What I want to do is find some simple stuff first and then show you the complicated stuff. | es: @0435 |
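As a rough illustration of the rule just described for the two front points of the beak (a fixed point plus and minus the envelope value), here is a minimal Python sketch. It is not taken from the patch; the function name and the example coordinates are hypothetical.

```python
def beak_tip_points(fixed_point, envelope):
    """Return the upper and lower front points of the beak.

    fixed_point is an (x, y, z) tuple; envelope is a non-negative number
    standing in for the envelope follower's (already scaled) output.
    """
    x, y, z = fixed_point
    upper = (x, y + envelope, z)
    lower = (x, y - envelope, z)
    return upper, lower


# With no sound the two points coincide; with sound they separate vertically.
print(beak_tip_points((1.0, 0.0, 0.0), 0.0))
print(beak_tip_points((1.0, 0.0, 0.0), 0.25))
```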
@0450 Here is simple stuff: I told you that everything is triangles and then of course here, I'm making a four vertex polygon which is a | es: @0450 |
@0465 quadrilateral. If you make quadrilaterals in OpenGL or GEM, make sure that the four points are coplanar because it will do the wrong thing if you give it a skewed quadrilateral, | es: @0465 |
@0480 that is, a quadrilateral whose vertices are not coplanar. In this case, almost everywhere, the Z value... <<video projector fails>> ... So you can all just look at my screen. [laughter] I don't know how this is going to work ... We'll see. | es: @0480 |
@0495 So the basic yoga of GEM is you're going to make things that have planar polygons. The easiest way to make a planar polygon is to make it a triangle because there's no way you will ever find to make a triangle not be planar. | es: @0495 |
@0510 If you make something with four or six points, then it's more work. But the easy way to make a polygon planar is to just set all the Z coefficients to the same value; then you know it's on a plane, which is "Z = constant". | es: @0510 |
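If the Z values are not all equal, one way to check that four points are still coplanar is the scalar triple product: it is zero exactly when the three edge vectors from one vertex lie in a common plane. The Python sketch below is only an illustration of that test; the function name and the tolerance are assumptions and have nothing to do with GEM's own code.

```python
def is_coplanar(p0, p1, p2, p3, tol=1e-9):
    """True if four 3D points lie in one plane (scalar triple product near 0)."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    u, v, w = sub(p1, p0), sub(p2, p0), sub(p3, p0)
    # Cross product v x w, then dot with u.
    cross = (
        v[1] * w[2] - v[2] * w[1],
        v[2] * w[0] - v[0] * w[2],
        v[0] * w[1] - v[1] * w[0],
    )
    triple = u[0] * cross[0] + u[1] * cross[1] + u[2] * cross[2]
    return abs(triple) <= tol


# All Z equal: planar by construction.
print(is_coplanar((0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)))
# One corner lifted off the plane: a skewed quadrilateral.
print(is_coplanar((0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 1)))
```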
Otherwise, you might have to work harder. Now, @0525 I'm just going to sort of talk into the air to tell you the hard part, which actually is getting the body of the bird, not the beak. The hard part is this: I went in and made a table which has two ... | es: @0525 |
... @0540 So, you have made arrays before. I might have shown you on one event or at one moment a graph that had two different arrays in it. So, I made a graph with two different arrays in it | es: @0540 |
@0555 and I drew the arrays to make the outline of the bird's body. So, it's not ... this wouldn't be true of any possible shape ... But the bird's body is actually designed in such a way that | es: @0555 |
@0570 every cross section of it -- Any time you intersect a vertical line with it, you just get a segment. You never get two different segments. In other words, the shape never does this. | es: @0570 |
The shape is always just from one point to another vertically. @0585 And so, you can describe that fully by just making two functions, which are just, say, the contour of the top of the bird and the contour of the bottom of the bird. And that was the only reason that I had the patience to do this: I was able to think of a shape that this was true of. | es: @0585 |
@0600 So, then, what do you do? Well you make a whole bunch of rectangles. So, you saw although I didn't go into it, an abstraction that was called a Rect 10, there were ten of them at the left side of the screen. | es: @0600 |
Each one of those things is making @0615 10 rectangles... <<Brady Baker arrives with replacement projector>> Oh, cool! Oh! Even better. Ladies and gentlemen: Brady Baker! [applause] | es: @0615 |
Miller: @0630 So, where were we? Oh yes. So I was telling you by waving my hands in the air ... The body of the bird goes from here to here and here it is: It is Pd tables and this | es: @0630 |
@0645 is my own artistry. So, I went to draw these two lines. And then how do you get those out? Well, you probably know, but, so, this is stupid: How do you make a hundred things in Pd if you don't want to make a hundred boxes? | es: @0645 |
@0660 You make 10 copies of an abstraction, each of which has 10 of the things. [laughter] Miller: And if you want a thousand of them, you know where to go. [laughter] | es: @0660 |
Miller: You know how to do that too now. So, the basic @0675 deal here is, we're going to tell this rectangle, oh, $1 $2 $3 means array1, array2 and 10, 20, 30, 40, blah, blah. So, what is happening here is, this is going to be "rect array1 array2 10 0". | es: @0675 |
@0690 And this will be the same thing except "20 0". And so on like that. | es: @0690 |
So, we go in here ... So, now $3 is tens and $4 is units and you're going to add the two of them to get a number which will range from 10 to @0705 99 or something like that -- I'm not sure how it works. And, what are we going to do? Well, we are going to go reading four points out of the table which are: (Sorry, | es: @0705 |
@0720 let's get back and get the table.) | es: @0720 |
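The "ten copies of ten" trick amounts to a nested loop in which each strip's index is a tens value plus a units value. A small Python sketch of that counting follows. The exact ranges are an assumption based on the speaker's later remark that the numbers run from 10 to 109, and the function name is hypothetical.

```python
def strip_indices(tens_values=range(10, 110, 10), units_values=range(10)):
    """Indices produced by ten outer copies, each holding ten inner copies."""
    return [tens + units for tens in tens_values for units in units_values]


indices = strip_indices()
print(len(indices), indices[0], indices[-1])
```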
So, for each ... Let's see. I said "rectangles" but really, the strips that we're drawing are trapezoids because the bird's top and bottom are not necessarily @0735 parallel. And so, what are we going to do? We are going to draw four points, a planar polygon having four points in it. And the four points are going to be picked up by looking at two adjacent points up here and two adjacent points down there. | es: @0735 |
@0750 So then, how do we do that? First off, we get the data that we need which is tabread-ing array1 and array2. So, here, $1 is array1, $2 is array2, | es: @0750 |
@0765 $3 is 10, $4 is 0. | es: @0765 |
So, here's tabread array1 and tabread array2 and here is tabread array1 and array2, but there the locations that we're looking at are one point further. One point further than what? ... @0780 So, we are going to receive from somewhere upstairs a message which is just a sequence of bangs in time called "doit" . | es: @0780 |
Actually, every single time doit gets banged the same thing comes out because nothing's changing here. But maybe something will be changing later. @0795 We want to make the bird to do something funny, right? So, this is a good way to just sort of compute a shape in such a way that, for instance, for every single frame that we want to draw in GEM we might want to re-compute the shape. | es: @0795 |
Bangs come in @0810 and now we compute what number we are and we do that by adding $3 and $4. Trigger bang, bang, get out $4, get out $3 and add them. And then we take that number and throw it in five different places. | es: @0810 |
...Well, six different places if you count this. @0825 So, we need to do six things: We look up four points out of the tables and we need to figure out the... -- that's four Y coordinates and then we need X coordinates and there will be two separate X coordinates which are the X location of the | es: @0825 |
@0840 left hand side of the trapezoid and the X location of the right. So, it is X1, X2 and the Y1, Y2, Y3, Y4, right? | es: @0840 |
So, I told you about getting the Y values out of the table. The X's are just, take the X value, whatever it is @0855 and fudge. So, in general, when you want to change something's range, you multiply by something and/or add something. In this case, I subtracted 50, so that it goes from -40 up to 50 I believe. | es: @0855 |
@0870 I'm not sure about that, up to 49? ... And then divide by 20 which is to say: get into GEM units which are from -3 to +3 if you're on the Z=0 plane -- And that's just sort of knowledge that | es: @0870 |
@0885 I have that you have to go from -3 to +3 -- That's the size of things in GEM unless you change it. | es: @0885 |
So here, this thing has a range of about 100, so when we divide by 20, it has a range of about 5 which is to say, most of the way across the screen -- which is what you see @0900 in the bird. And so, we just take the number and then we take it +2 +2 ... yeah. Whatever. | es: @0900 |
And...that's strange. Why wouldn't that @0915 be +1? So, the numbers that come in here are all the numbers from 10 to 109 I think actually in increments of one and what this is saying is that the rectangles are actually overlapping slightly because | es: @0915 |
@0930 it's going from 10 to 12 and then from 11 to 13 and so on like that. Go figure. That's what I had to do to make it work right. OpenGL... And then, here are points. Points are | es: @0930 |
@0945 three coordinates apiece and ... | es: @0945 |
And the Z coordinate is always 0 because we are flat on the Z=0 plane, and X and Y are just being computed: the Y values are coming out of the tables, 1, 2, 3, 4, and the X values are coming out of these computations. @0960 And those are the vertices of this trapezoid which is going to be a polygon with four points. | es: @0960 |
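Putting the pieces together, one strip of the body could be computed roughly as in the Python sketch below. This is a reading of the description, not the patch itself: the two lists stand in for array1 and array2, the offset of 50 and divisor of 20 are the values mentioned above, and using the same step of 2 for both the X positions and the table lookups is an assumption (the speaker was unsure why it was not 1).

```python
def strip_vertices(top, bottom, n, step=2, offset=50.0, scale=20.0):
    """Four (x, y, z) vertices of the strip that starts at table index n.

    top and bottom are sequences holding the upper and lower contours.
    """
    x_left = (n - offset) / scale
    x_right = (n + step - offset) / scale
    # Walk around the outline so the polygon does not cross itself.
    return [
        (x_left, top[n], 0.0),
        (x_right, top[n + step], 0.0),
        (x_right, bottom[n + step], 0.0),
        (x_left, bottom[n], 0.0),
    ]


# Toy contours (a flat slab) just to exercise the function.
toy_top = [0.5] * 112
toy_bottom = [-0.5] * 112
for vertex in strip_vertices(toy_top, toy_bottom, 10):
    print(vertex)
```

Every vertex has Z equal to 0, so the polygon is planar by construction.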
And the four points are specified using packed triples. And here are points 1, 2, 3 and 4. @0975 You can make a polygon with as many points as you want but do make it planar. In fact, make it convex. Why? It says in OpenGL that you can only make convex polygons. | es: @0975 |
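For polygons lying in the Z=0 plane, convexity can be checked by making sure the outline always turns the same way. The Python sketch below shows one common way to do that with 2D cross products; it assumes the outline does not cross itself, and the function name is hypothetical.

```python
def is_convex(polygon):
    """True if a simple 2D polygon (list of (x, y) in outline order) is convex."""
    n = len(polygon)
    sign = 0
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        cx, cy = polygon[(i + 2) % n]
        # Z component of the cross product of consecutive edges.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            current = 1 if cross > 0 else -1
            if sign == 0:
                sign = current
            elif current != sign:
                return False  # the outline turned the other way
    return True


print(is_convex([(0, 0), (1, 0), (1, 1), (0, 1)]))
print(is_convex([(0, 0), (2, 0), (2, 2), (1, 1), (0, 2)]))
```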
@0990 And then, meanwhile, now, what you saw last time was simple GEM examples where there was basically a gemhead object ... blah, blah, blah, blah, until finally you get down to a thing which you have to draw which is a polygon of some kind. | es: @0990 |
And here, @1005 gemhead set the color to black which is that and then: Draw a polygon. So, the GEM chain if you like that's to say the sequence of events which happens in this window every time it renders a frame, is head, | es: @1005 |
@1020 set color and draw a polygon. And there are a hundred of these. | es: @1020 |
So, that is this body. And now, I am just going to go fishing around and @1035 find the beak. I believe this is the beak down here and yeah, let's see. Can I scroll in in such a way as to make it easier to see this? Oh. | es: @1035 |
@1050 These are the eyes... | es: @1050 |
@1065 I told you they were hexagons. I just did this in my head and figured out the points that are the coordinates of a regular hexagon. If you studied trig in high school and you haven't forgotten it, you can do that. | es: @1065 |
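For anyone who has forgotten the trig, the vertices of a regular hexagon sit at multiples of 60 degrees around a circle. A minimal Python sketch follows; the radius and the starting angle are arbitrary choices, not the values used in the patch.

```python
import math


def regular_hexagon(radius=1.0, z=0.0):
    """Vertices of a regular hexagon centered on the origin, in outline order."""
    return [
        (
            radius * math.cos(math.radians(60 * k)),
            radius * math.sin(math.radians(60 * k)),
            z,
        )
        for k in range(6)
    ]


for vertex in regular_hexagon(0.1):
    print(vertex)
```

A translation like the one described next can then move the whole hexagon to wherever the eye belongs.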
@1080 So these are two hexagons with centers ... They should be getting added but...Oh, I see. I am using "translate" to move the eyes over to where you want them and in fact, just for testing's sake, and this is by the way how I actually designed this: | es: @1080 |
@1095 I fixed it so that the translate actually had a nice message box on it so I could figure out where I wanted the eyes to be and then when I got the eyes where I wanted them to be, I copied the numbers into it. -- | es: @1095 |
@1110 Seat of the pants. | es: @1110 |
Now, the beak. That's the only part that's actually doing anything animated ... And I have lost the beak ... These are the two legs. @1125 Where is the beak? The beak is probably going to have to be in that sound ... Come here. Scroll. Oh, beak! | es: @1125 |
@1140 Duh. Beak, here is the beak. [laughter] | es: @1140 |
This is where you start to realize that Pd makes a lousy programming language. This, if you were a programmer would be two lines of code, @1155 but since we're in Pd land, it's not two lines of code, it's a whole messy page of stream of consciousness. | es: @1155 |
The good thing is you can actually program Pd by stream of consciousness which you can't really do in C. The bad thing is of course @1170 then you have to explain it. [laughter] -- It looks like stream of consciousness. | es: @1170 |
-- Which is what it is. So, here's the deal. Beaks consist of three polygons and there should be a pair of two of them that share vertices. @1185 Where are they? (Drat. I've got to make this smaller, so I can start scrolling. There. Good enough.) There are the three polygons. | es: @1185 |
Now, @1200 what you don't see is: This number here is coming in from here and this is the number that goes up ["huh" microphone input] when you make noise with the mic and goes down when you don't. That, | es: @1200 |
@1215 we're going to look at later. That is the envelope follower. | es: @1215 |
And notice again, I've messed around and tried to get this thing to have some reasonable range of values. So, now ... @1230 this is going to be the bottom triangle of the beak. I can tell because this number is going to be used in two different | es: @1230 |
@1245 ways depending on where we're at. | es: @1245 |
So, what's happening here is, there's a floating point number which is going into the second value, which is the Y coordinate of this polygon. The X coordinate, that's to say the first value, is 0 @1260 which means the beak is right in the middle of the window. Is that true? It's not true. Oh, there is probably a translate object. Oh, yeah. Here we go. Here is translation again. This is me figuring out where to put the beak.[laughter] | es: @1260 |
@1275 This is great.[laughter] | es: @1275 |
Let's put it back. Oh yeah, right. So, we translate it. So, since we are using the translation, we can think of everything as being centered around 0 which is what's happening down here. @1290 So, all we're doing is we're packing something that has a Y value which is the envelope value times -1 (... but what's happening here? ...) Oh, I'm sorry. This is the other thing. This is the width of the beak | es: @1290 |
@1305 which is this number. So, that is getting randomized every time there is an attack [clapping] ... | es: @1305 |
... which I will show you later. So, that is something called "w" for width and then the other thing is "h" for height and that's getting computed @1320 elsewhere. | es: @1320 |
There are three points of the polygon which are: the left corner, the right corner and the bottom point of the beak. So, here's the left corner which is at 0. Here's the right corner, ... ooh, sorry, that's the right corner @1335 at zero. Here's the left corner at "0 -.1 0" ... And then, that number is changing when this changes. ... | es: @1335 |
@1350 Why can't I get a good number there? There! | es: @1350 |
@1365 I got the beak back... So, I'm going to stop improvising now and just tell you that after a while, you'll make all sorts of mistakes and then at some point, you'll get all those points right and then you'll have a nice looking beak. | es: @1365 |
@1380 Now, what I do want to do is go back and show you where these numbers come from because that's the connection with the audio. And so those are numbers which are h << and w >> ... This is received height and received width. And those | es: @1380 |
@1395 numbers are being computed as a function of the sound. So, cool object: | es: @1395 |
The envelope follower. This is a thing that I can sort of tell you what it does -- It is taking the signal @1410 and squaring it and then putting it through a low pass filter. ... Yeah? Audience: Is the audio input coming from the mic? | es: @1410 |
Miller: Yeah. This is the mic. So, if I turn it off, now you are seeing the noise floor of my audio system. And now you are seeing @1425 noise floor of the room. And now, you're seeing the noise of me talking which is a little bit louder than that. | es: @1425 |
This is in decibels and @1440 a reasonable dynamic range of something that's happening depending on what it is. It could be 30 to 60 decibels. So, when you're just sort of talking, it's varying like 20 or 30 decibels. | es: @1440 |
@1455 If you have a trained singer singing something, it's going to be more like 60 dB or 60 decibels because they use dynamics a lot harder than regular people talking do. So, what we're going to do is take that and figure out a good way | es: @1455 |
@1470 to change its units into something useful. | es: @1470 |
Useful units are things that range maybe from 0 to 1-ish which is what's going on here. How do you get it to range from 0 to 1? Well, you just basically @1485 fudge until something good happens. And, in this case, what the fudge consisted of ... let's assume that the noise floor is going to be 30 dB. | es: @1485 |
Why? Because it turns out that in a wide variety of rooms the noise floor is somewhere in the 20s of dBs when you set @1500 a reasonable volume level. -- It's just a good number. And so, what you do is you take this and subtract out what you think the noise floor is so that you'll get a number that is positive when something's happening above the noise floor and negative when it's just noise. | es: @1500 |
And then, @1515 you use this wonderful binary operation "max" which is maximum. So, if you ask for the maximum of 0 and something, then when it's positive, the maximum is the number and when it's negative, the maximum is 0. So, this is a way of clipping the thing below by 0. | es: @1515 |
@1530 And then, this is where your hand really starts seriously to wave: What should be the transfer function by which the amount of decibels in excess of 30 | es: @1530 |
@1545 turns into the height of the beak? The answer might not be just plug the number of decibels straight into the height. I discovered that it's better to have loud noises have a proportionally | es: @1545 |
@1560 larger effect than quiet ones. | es: @1560 |
So, I ended up deciding that the right thing to do is to square the number. So that for instance, 10 dB turns into 100 but 20 dB turns into 400. And, @1575 it should therefore be true that as I get louder, the change in the beak gets more pronounced. It is not really true though... | es: @1575 |
Anyway, if you don't square it, @1590 then you get that the mouth is just sort of always sitting kind of open and doing a little bit of this stuff and you really want it to just open and shut and it turns out the way to do that is to square the envelope. Square the number in dB... Go figure. | es: @1590 |
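The chain just described (subtract the noise floor, clip below at 0 with max, square, then scale) can be written out as a short Python sketch. The 30 dB floor is the value given above; the final scale factor is a made-up placeholder, chosen only so that the result lands in a 0-to-1-ish range, and the function name is hypothetical.

```python
def beak_height(level_db, noise_floor_db=30.0, scale=0.001, square=True):
    """Map an envelope level in dB to a beak opening.

    scale is a placeholder fudge factor, not the value used in the patch.
    """
    excess = max(0.0, level_db - noise_floor_db)  # clip below at 0
    shaped = excess * excess if square else excess
    return shaped * scale


for level in (25.0, 40.0, 50.0, 60.0):
    print(level, beak_height(level))
```

With squaring, 10 dB above the floor gives 100 before scaling and 20 dB gives 400, so louder sounds open the beak disproportionately more.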
In fact, I'll show you @1605 how it can be lame -- I'll just not square it. And then of course, I'll have to multiply it by something different. Let's actually get a new one. So, we'll just say $f1 times something small. | es: @1605 |
@1620 So, 5 ten-thousandths, let's try this number. Here we are. That's kind of lame. Right? | es: @1620 |
@1635 And, so, if you don't square the envelope, you either have that kind of lameness or you have the beak always shut kind of lameness but you don't get a decent range of openness and shutness. And I don't know how to explain any better than that. | es: @1635 |
And furthermore, I don't believe I even explained @1650 the "expr" object, did I? This is your object for making C-like mathematical expressions where $f1 means the floating point number coming in inlet 1 and $f2 is the floating point number coming in inlet 2 and so on. So, | es: @1650 |
@1665 if you wanted to do this with just regular objects, you have to do a "trigger float float" so you could multiply the thing by itself and then you'd have a separate object to multiply it by. So, this is replacing three objects with one expr. | es: @1665 |
expr was written by Shahrokh Yadegari @1680 who teaches in the Theater Department here. ... Yeah? Audience: What's that "env~ 4096" ? | es: @1680 |
Miller: Oh, yeah. Thank you. I ran roughshod @1695 over one important detail. So, an envelope follower (classically, anyway) is: take the signal and rectify it somehow, for instance by squaring it, and then run it through a low pass filter. | es: @1695 |
This particular envelope follower is better than that. What it does @1710 is it looks at a certain window of the signal and multiplies it by a smoothing function and then measures the total power within that window. | es: @1710 |
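For comparison, the classical version (square the signal, then low-pass it) can be sketched in a few lines of Python. This is the textbook form only, not how env~ works; the one-pole filter and its smoothing coefficient are assumptions chosen for illustration.

```python
def classic_envelope(signal, smoothing=0.999):
    """Square each sample, then smooth with a one-pole low-pass filter.

    Returns the running power estimate, one value per input sample.
    """
    state = 0.0
    out = []
    for x in signal:
        state = smoothing * state + (1.0 - smoothing) * (x * x)
        out.append(state)
    return out


# A constant signal of 0.5 settles toward a power of 0.25.
envelope = classic_envelope([0.5] * 20000)
print(envelope[0], envelope[-1])
```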
This number that you give it is a power of two which is the size of the window that it analyzes. So, @1725 here, I decided that I wanted it to analyze a tenth of a second at once -- basically, because a tenth of a second is large enough to hold a syllable or to hold an utterance of some sort. | es: @1725 |
@1740 Why is this a tenth of a second? It's in samples and we're at 44K1 sample rate -- so 4096 is about a tenth of a second. If you make this smaller, which you can -- | es: @1740 |
@1755 Like let's make it 256. Then it starts moving. It starts outputting stuff real fast and then, "Hello, hello," that's still working just fine. Never mind that. ... | es: @1755 |
@1770 I was expecting that, for instance, if I make a steady sound with a low pitch it might get to the point that sometimes the envelope falls between peaks of the thing and gives you a smaller number and sometimes it falls on a peak and gives you a bigger number. | es: @1770 |
@1785 And so if I give it 100 Hertz or so, [ "Aaahhhh" sound] it is not very steady, is it? [laughter] | es: @1785 |
"Aaahhhh," Like that? So now ... If you tell the envelope to look at more samples at once, @1800 do the same thing and it gives you much stabler looking result. "Aaahhhh," I think. "Aaahhhh." Yeah. It's better. | es: @1800 |
So, in general, this is actually @1815 easy to understand with envelope following, but it's in general kind of true in audio analysis that the larger a window of data you look at, the stabler -- the less ragged -- the output is going to be, or the less quickly changing the output is going to be. | es: @1815 |
@1830 So, I just threw it a large number for that reason. And yes, env~ -- | es: @1830 |
-- for technical reasons only likes powers of 2. It's ugly. If you give it some other value, it will just change it to a power of @1845 two for you but of course, if you just give it the power of 2, then it does exactly what you said and you can see what it's doing, which is better. | es: @1845 |
So, I haven't told you a whole lot about env~ . I'm just sort of letting you @1860 enjoy what it does. And so, now, what I'm going to do ... Actually env~ I've told you just about everything you need to know about it. Although I will just pull it out in its own right. Actually, everyone's tired of the bird now, right? Can I get rid of the bird? [laughter] | es: @1860 |
Audience: @1875 Only if you give it a name. Miller: What would its name be? Audience: [laughter] | es: @1875 |
Miller: Well, I'll, you know, keep the bird around. I don't think it will hurt us to have the bird out. @1890 The only thing is I should try to miniaturize this window. Audience: Are you going to save this patch? ... | es: @1890 |
Miller: Yeah. This is...@1905 you know, I made this some years ago. So, this is probably flying around the net already, but I'll stick it up on my site too. This is a very useful tool. [laughter] You might not believe it now. [laughter] | es: @1905 |
@1920 But, when you grow up, you'll realize that you need things like this. [laughter] | es: @1920 |
Envelope following: So, it's the usual thing that you can imagine. Take .. (oh, actually, @1935 let's not look at that. Let's look at a nice oscillator.) So we'll take an oscillator and I am going to have a number box to say what it is. | es: @1935 |
We will have an oscillator @1950 with a controllable frequency. And then we are going to say envelope. <<env~>> (Whoops, gosh, I'm not using the right key accelerators today.) And I'll, just for consistency's sake, let's say 4096 now. | es: @1950 |
And then, we will look at the result. @1965 Oh, we could look at the result using one of these ... And yeah -- | es: @1965 |
@1980 I'm not sure this is actually better pedagogy or not. | es: @1980 |
This is a slider. This is a thing which ... It's a nice graphical control that lets you do this kind of stuff, right. Well, you all probably found it already. So, gee, what's happening? Going in is @1995 this oscillator. It has a frequency of 0 and out is coming a value which I believe is going to be a hundred. Actually, we should look at it too: Ta-da. | es: @1995 |
So, the oscillator is putting out 1 as a signal, right, because it has 0 frequency @2010 and 1 is 100 dB. Why? Because it's arbitrary how loud 100 dB is chosen to be -- decibels are a relative scale, but in Pd, there is a convention that 100 dB is an amplitude of 1. | es: @2010 |
@2025 Now, we'll set the oscillator doing something. So, let's play A440. And then, lo and behold, the thing drops by almost exactly 3 dB ... well, anyway, 3-ish dB. | es: @2025 |
@2040 That's because when you start an oscillator going, it does hit 1 and does hit -1 -- but the average value in root-mean-square-land. ... (In other words, if you took it and averaged it in the way that you get | es: @2040 |
@2055 when you square the thing, average it and then take the square root -- that's called a root-mean-square, which is the good average for doing things in audio.) ... | es: @2055 |
The root-mean-square average of a sinusoid is the square root of 1/2; @2070 it's 0.707-ish. Or put it another way, 1/2 power is almost exactly three decibels. It is 3.02 decibels for people like me... And that is what you get here. | es: @2070 |
@2085 Hmm?! ... 3.01-ish -- actually, that's not quite right. But good enough. | es: @2085 |
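The two readings can be reproduced with a little arithmetic. The Python sketch below uses the convention stated above (an amplitude of 1 corresponds to 100 dB) and a plain, unweighted RMS over one 4096-sample window; it is only a check of the numbers, not a model of env~, and the function names are hypothetical.

```python
import math

SAMPLE_RATE = 44100
WINDOW = 4096


def rms(samples):
    """Root-mean-square: square, average, then take the square root."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_to_db(value):
    """Convention from the lecture: an amplitude of 1 maps to 100 dB."""
    if value <= 0:
        return 0.0
    return 100.0 + 20.0 * math.log10(value)


# Zero-frequency oscillator: a constant 1.
dc = [1.0] * WINDOW
# 440 Hz oscillator with peak amplitude 1.
tone = [math.cos(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(WINDOW)]

print(rms_to_db(rms(dc)))           # the constant signal
print(rms_to_db(rms(tone)))         # roughly 3 dB lower
print(rms_to_db(math.sqrt(0.5)))    # ideal sinusoid: RMS is sqrt(1/2)
print(WINDOW / SAMPLE_RATE)         # window length in seconds
```

Half power corresponds to 10·log10(2), about 3.01 dB, which is why the sinusoid reads roughly 97 rather than 100.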
Notice though that this value is exceedingly stable. It's nailing it to four decimal places and not varying at all. In fact, I can even make the thing fatter. @2100 Not that fat... Look at that -- rock solid. That's just too good to be true, right? | es: @2100 |
Now, @2115 let's start dropping the frequency. And now, you will see that the slower this thing oscillates, the less there is a tendency, ... | es: @2115 |
@2130 Well, so, what's really happening here? The envelope is running every, certain number of samples. Actually, it does an overlap of two. So, it is doing an analysis every 2048 points, but it's doing it on a window of 4096, | es: @2130 |
@2145 so everybody gets seen twice. | es: @2145 |
As you get fewer and fewer waves of the oscillator in the window, and depending on the phase of the oscillator right at the beginning of the window, you might get slightly different results. And as you slow the oscillator down, @2160 so that there are fewer and fewer waves -- so that there's less and less averaging going on from one part of the wave form to another -- you'll get more and more variation until at some point, at some horrendously low frequency, | es: @2160 |
@2175 at like a tenth say, you actually see the thing: What's happening now is the oscillator itself is taking 10 seconds to do a cycle and the analysis | es: @2175 |
@2190 period is only 1/10 of a second and so the analysis thing only sees one tiny portion of the elephant, so to speak. So now what we're seeing is highly variable. | es: @2190 |
In fact, this is not @2205 a good representation of the RMS loudness of this oscillator in some sense -- or maybe it is because ... What does that even mean when we're doing this? ... Questions? At any rate, it will turn | es: @2205 |
@2220 out that as soon as you get up to a frequency such that a whole cycle fits in the analysis window, so that will be 10 Hertz. | es: @2220 |
By the time you get up there, it is able more or less to give a good answer. So, in @2235 this case, with this big fat window, we can actually measure the loudnesses of oscillators all the way down to about 10 Hertz; but if we try to get down to something lower, it starts messing up. | es: @2235 |
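To see why the reading wobbles at low frequencies, one can compute the level of each overlapping window directly. This is a rough sketch under stated assumptions: a 44100 Hz sample rate, a plain rectangular window (the Pd object may weight its window differently), and hypothetical helper names `window_levels_db` and `spread_db`.

```python
import math

SAMPLE_RATE = 44100  # assumed sample rate
WINDOW = 4096        # analysis window size from the lecture
HOP = 2048           # overlap of two: a new analysis every 2048 samples


def window_levels_db(freq, seconds=2.0):
    """Level in dB (relative to RMS 1) of each overlapping window of a sinusoid."""
    total = int(SAMPLE_RATE * seconds)
    signal = [math.sin(2 * math.pi * freq * n / SAMPLE_RATE) for n in range(total)]
    levels = []
    for start in range(0, total - WINDOW + 1, HOP):
        chunk = signal[start:start + WINDOW]
        mean_square = sum(x * x for x in chunk) / WINDOW
        # Guard against log(0) for a window that happens to be silent.
        levels.append(10 * math.log10(max(mean_square, 1e-12)))
    return levels


def spread_db(freq):
    """Difference between the highest and lowest window reading."""
    levels = window_levels_db(freq)
    return max(levels) - min(levels)


for f in (440.0, 10.0, 1.0):
    print(f, spread_db(f))
```

The spread is tiny at 440 Hz, noticeable around 10 Hz where roughly one cycle fits in the window, and large at 1 Hz where each window sees only a fraction of a cycle.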
@2250 And that is a trade-off with the size of the...it is another trade-off in fact with the size of this window, this analysis window. So, this is slow. This is only giving us ... Well, let me put it another way. | es: @2250 |
@2265 If you put something in with a sharp attack: The thing was off and the thing is now suddenly on. You might actually want to know right when that attack is in time. And so, you might want to have some time resolution in | es: @2265 |
@2280 the output of the envelope follower. The time resolution here is terrible. It's a tenth of a second. And so, if you gave it an attack, you would see the thing sort of decide there was nothing, but then over an entire tenth of a second gradually decide that yes, there actually was a signal there. | es: @2280 |
@2295 So, if you made that window smaller, then it would give you a more and more high time resolution estimate of when that attack occurred. So, that would be a good reason for wanting to keep this window size small -- | es: @2295 |
@2310 so that you get good time resolution. But, of course, as the time resolution goes up, the frequency goes up that you have to get to, before the thing works. | es: @2310 |
So, again, if you @2325 decide that this thing is going to be a hundredth of a second long, then unless you have at least a hundred Hertz sinusoid, it sounds like the thing is getting louder and softer on that time scale. So, different time scales tell different stories. So here, | es: @2325 |
@2340 100-ish times per second is about 512 samples. It can't do 10 Hertz. | es: @2340 |
It can still do a 100-ish Hertz I think. Yeah. In fact, @2355 512 if I remember correctly, this is one 86th of a second. (Now, why do I know that? Because I am the sort of sick person who actually stays up all night working on these things and ... | es: @2355 |
@2370 that is a constant that comes over and over again.) So, I should be able to go down to about 86 Hertz and still get decently stable results but, when I drop below 86 Hertz, yeah, then it starts | es: @2370 |
@2385 going nuts. So, this number 86 is sort of the bottom frequency at which this gives me a stable result. | es: @2385 |
But, this is a nice fast envelope follower. It's an 86th of a second, so it will tell me when an attack is, to within whatever that is -- @2400 an 86th of a second time resolution; and that is a trade-off. | es: @2400 |
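The numbers in this trade-off follow from simple arithmetic. A small sketch, assuming a 44100 Hz sample rate (which is what makes 512 samples come out to about one 86th of a second); the function name `window_tradeoff` is made up for illustration.

```python
SAMPLE_RATE = 44100  # assumed sample rate


def window_tradeoff(window_samples, sample_rate=SAMPLE_RATE):
    """Return (window duration in ms, lowest frequency with one full cycle per window)."""
    duration_ms = 1000.0 * window_samples / sample_rate
    lowest_hz = sample_rate / window_samples
    return duration_ms, lowest_hz


for size in (4096, 1024, 512):
    duration_ms, lowest_hz = window_tradeoff(size)
    print(size, duration_ms, lowest_hz)
```

Halving the window halves the time it takes to notice an attack and doubles the lowest frequency that still reads stably.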
That trade-off isn't, but you could sort of hand-wavingly call it, the Heisenberg Uncertainty Principle. In fact, when you get @2415 into chapter nine of the book, you will actually really see the Heisenberg Uncertainty Principle. But, you'll have to do Fourier analysis to do it for real. | es: @2415 |
So, this is envelope following and how to choose this number depending on how low frequencies you want to be able to deal with. ... @2430 Questions about this? | es: @2430 |
Oh, out come decibels. Frequently, you want other units besides decibels like just RMS linear units @2445 and then you have all these wonderful conversion objects like dbtorms that you can use to fix that. | es: @2445 |
And, another little thing is -- as you have seen in other situations -- If you put nothing in, @2460 it considers nothing to be 0 dB. Although actually, nothing is minus infinity dB. It refuses to give you a number below 0. I do not know why, for some sort of sanity reasons. | es: @2460 |
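The conversion between decibels and linear RMS units can be sketched as follows. This assumes the Pd convention that 100 dB corresponds to an RMS amplitude of 1 and that results are floored at 0 dB; both are assumptions here, so check the help files for `rmstodb` and `dbtorms` before relying on the exact numbers.

```python
import math


def rmstodb(rms):
    """Linear RMS to dB, assuming 100 dB means RMS 1 and never going below 0."""
    if rms <= 0:
        return 0.0  # silence is reported as 0 rather than minus infinity
    return max(100.0 + 20.0 * math.log10(rms), 0.0)


def dbtorms(db):
    """dB back to linear RMS under the same assumed convention."""
    if db <= 0:
        return 0.0
    return 10.0 ** ((db - 100.0) / 20.0)


print(rmstodb(1.0))             # full-scale RMS
print(rmstodb(math.sqrt(0.5)))  # a full-scale sinusoid, about 3 dB lower
print(dbtorms(rmstodb(0.25)))   # round trip
```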
Now, envelope following @2475 is great; but another thing that you might want to know about a signal is "What is the pitch of the signal?" So the sort of basic things that you talk about in music are pitch and loudness. Time of course, but time is just passing. | es: @2475 |
So, how do you get @2490 pitch? Well, the answer is you reach for different objects. So, I am just going to tell you what object it is. It is called sigmund~ . It's named that | es: @2490 |
@2505 because it does analysis, and out come pitch and envelope. And the pitch is the interesting output for right now. It is in MIDI units. (And here, | es: @2505 |
@2520 this is stupid ... | es: @2520 |
This could have just been zero too but for some reason, I ended up thinking that you wanted to be able to deal with MIDI values that were below zero for describing vibrato speeds and things like that which are below MIDI 0. @2535 And so, this is really the smallest number that you can reasonably represent in pitch. <<-1500>> | es: @2535 |
This is the MIDI number for the smallest possible floating-point number, almost exactly. And here, you put your nice oscillator in @2550 and out comes a number which should ideally be the MIDI pitch -- not the frequency in Hertz -- corresponding to this frequency. | es: @2550 |
So, theoretically now, if I converted from @2565 MIDI to frequency, then I should be getting out the frequency of this oscillator. | es: @2565 |
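For reference, the usual equal-tempered mapping between MIDI pitch and frequency looks like this. It is a sketch of the standard formula (A440 at MIDI 69), not a copy of Pd's own implementation, which may clip extreme values differently.

```python
import math


def mtof(midi):
    """MIDI pitch to frequency in Hertz, with MIDI 69 defined as 440 Hz."""
    return 440.0 * 2.0 ** ((midi - 69.0) / 12.0)


def ftom(freq):
    """Frequency in Hertz (must be positive) to MIDI pitch."""
    return 69.0 + 12.0 * math.log2(freq / 440.0)


print(mtof(69))     # A440
print(ftom(220.0))  # one octave down is 12 MIDI units lower
print(mtof(-1500))  # a frequency near the bottom of 32-bit float range
```

Under this formula, MIDI -1500 corresponds to a frequency on the order of 1e-37 Hz, which is why it serves as a stand-in for "no pitch".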
@2580 And this is horrendously stable -- This is changing by 0.005 Hertz plus or minus | es: @2580 |
@2595 which is probably inaudible? | es: @2595 |
And, in fact, I am torturing it a little bit by giving it this very low frequency. If I give it something more reasonable like A 440 -- now, it is down to a part in @2610 a million. So, this is good, numerically accurate stuff going on. | es: @2610 |
Unfortunately, it's not true that real signals @2625 have periodicity. And so, when you give it a real signal, no one will ever know whether it's saying something accurate or not because what could you measure it against? -- Somebody else's notion of what the pitch should be. So, who knows what the accuracy of this thing is really. | es: @2625 |
@2640 But at any rate, to use it, you just for instance, run the analog to digital converter into it. And now you get ... When nothing is happening, it still gives you that and when you start giving it pitch | es: @2640 |
@2655 then you start seeing numbers come out; ["hello," in microphone] And of course, you know, speech is not singing, but speech still has pitch. | es: @2655 |
So, now, we can do the following horrible thing: (And @2670 by the way, this is another copy of the envelope just because it computes the envelope anyway as a by-product of what it has to do internally, and so it gives it to you.) | es: @2670 |
So, now, we will just take that and use it to drive a nice oscillator, why not? @2685 So let's say, the simplest way to make a nice sound is probably to take an oscillator and then to add something to it -- waveshape it, right? So, we'll add | es: @2685 |
@2700 something to it and then just take the cosine. | es: @2700 |
So, here, I am not going to dwell on it, but this is waveshaping as you saw it a few classes ago ... @2715 And then we'll say cosine ... And then, I want to listen to the output. | es: @2715 |
Oh, wait, I want to multiply it by something to control the amplitude. (Let's see, I'm not going to need this anymore now. @2730 Put this up where we can see it. Don't need that. And now, let's see.) | es: @2730 |
So, to control the amplitude, we will take this and multiply it by @2745 the output of a line~ -- all the good usual stuff. And now, I'll just take these frequencies...(Oh! -- I don't want to give it minus ... | es: @2745 |
@2760 I got rid of that other thing which is ... I need this thing in Hertz not in MIDI, right? | es: @2760 |
So, I did need that other, go away. OK, let's try this...) @2775 We are going to convert MIDI to frequency. And by the way, I don't like those 0 values, so what I am going to do is I'm going to say "Only | es: @2775 |
@2790 give me people who were at least something reasonable like 20 Hertz." | es: @2790 |
And, now, what's going to happen is when it actually gets .. (It's hearing my fan and stuff like that right now.) ... but, whenever @2805 it gets a pitch, you see it. And then, like here: ["aaahhhh" in microphone] and then when it doesn't have a pitch, it just freezes on whatever the previous pitch was. | es: @2805 |
So, this now will be our nice oscillator frequency. @2820 And now, what we're going to do is take this number here, convert it: So it's in dB, so we have to say dbtorms, and I'm going to sneak a look at it and make sure | es: @2820 |
@2835 we've got something reasonable. Yep -- before I go playing it. ... So, Sigmund by default, is running at 86 Hertz which is every | es: @2835 |
@2850 11-ish milliseconds, 12-ish maybe. So, we are going to pack this with 12. You could measure that by the way, | es: @2850 |
@2865 but I am not going to get into it. And, now, we are ready to listen to the result ... | es: @2865 |
@2880 This could be good and it could be awful. ["Hello" in microphone] Yup.. [transformed voice] Wow, stupid. Why is ... Oh, I know: So that the bird would work, | es: @2880 |
@2895 I put a huge delay on the audio. | es: @2895 |
So, I've got to say this bird is computing at ten frames a second. So, I put about 100 millisecond delay on the audio so that the thing can think for a tenth of a second to try to compute these things. So now you hear this huge delay and if I --- @2910 I'll have to stop the bird rendering if I want to take the delay out. So, we will just live with the delay right now because everyone seems to like the bird, right? [laughter] | es: @2910 |
Miller: So, now, what you got is just me [transformed voice synthesis from microphone] ... turned into nothing. So, this is a nice @2925 voice-controlled synthesizer. So, you can...Oh, yeah, yeah, right. [laughter] | es: @2925 |
Yeah. That's actually a trombone with a mute. Yeah, but you could have done it this way. @2940 Yeah. So, that is a demonstration just of using pitch and amplitude to control some very simple synthesis voice. | es: @2940 |
@2955 About sigmund~: Pitch tracking is a much more complicated thing to do than envelope following. In envelope following, you just square things and add them up and you're happy. Pitch tracking, there are at least a thousand | es: @2955 |
@2970 papers on how to determine the pitch of an acoustic signal, and Sigmund happens to use one which I just sort of pulled out of a hat and works OK. | es: @2970 |
I don't have any proof that it works better than anyone else's. @2985 I think it works pretty well as these things go. And in fact, this is the third one I have written and it works better than the first two I think for most signals. You can get it to do all sorts of stuff. | es: @2985 |
@3000 In particular, it has to find the sinusoidal peaks that are present in the signals so that it can try to figure out the pitch. | es: @3000 |
Because you're basically using a frequency domain. And as a result, you can ask it to output not just the pitch but all the sinusoids @3015 it found; all the sinusoidal components. | es: @3015 |
And, you can catch those and then you can make a bank of oscillators that plays just sinusoids following the tracks and then you will get a re-synthesis of your sound in sinusoids -- which can be a very powerful thing to have because, for instance, you can freeze @3030 a sound or morph it into some different kind of sound. | es: @3030 |
So, I am not going to show you how to do that because it would take some time to work it all up. But there's a lot of stuff that sigmund~ can do for you, that's @3045 worth looking at. And, the help window is scarily detailed. So, what I am going to do instead of doing that is move on to another thing. The other main analysis thing | es: @3045 |
@3060 that Pd comes with, which is useful, is called bonk~ . This is the attack detector. | es: @3060 |
@3075 Yeah, so I mentioned although I did not dwell on it that this bird example actually has an attack detector in it. (ooh, the bird died. Oh, I turned the mic off. ["Hello."]) So, the way the | es: @3075 |
@3090 attack detection is happening here is very, very crude and I won't dwell on it. But basically, it has got a pair of thresholds and whenever the amplitude goes below a low threshold, the thing is off. And then whenever it goes back through the high threshold it's on. And there's a very simple | es: @3090 |
@3105 state-machine patch that does that, that you can go look up. It's a good thing to learn how to do, if you ever want to detect beginnings of things. That is the way. | es: @3105 |
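The two-threshold idea can be written as a tiny state machine. This is a sketch of the general technique only, not the patch from the bird example; the threshold values and the function name `detect_attacks` are invented for illustration.

```python
def detect_attacks(levels_db, low=50.0, high=70.0):
    """Return the indices at which the level rises through `high`
    after having previously fallen below `low`."""
    attacks = []
    is_on = False  # the detector starts in the "off" state
    for index, level in enumerate(levels_db):
        if not is_on and level >= high:
            attacks.append(index)  # off -> on: report an attack
            is_on = True
        elif is_on and level < low:
            is_on = False          # on -> off: ready for the next attack
    return attacks


levels = [0, 40, 75, 72, 60, 71, 45, 30, 80, 90]
print(detect_attacks(levels))
```

Using two thresholds instead of one keeps a level that hovers around a single threshold (the 60 then 71 above) from re-triggering over and over.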
bonk~ is a thing that is actually for detecting attacks with a very @3120 high time resolution. So, higher time resolution than an envelope follower could possibly give you. What it is, is a filter bank and there is a paper about it and all that kind of nonsense. But what it does is: You just run a signal in and by default it's hypersensitive. | es: @3120 |
@3135 So, rather than go mess with its parameters -- there are hundreds of parameters you can give it -- I am just going to cheat and take my incoming signal and multiply it by 1/30th so that it is about right. | es: @3135 |
@3150 And now, I am going to go get a button just so it will flash whenever it gets happy. And now ... [Snapping fingers in microphone] | es: @3150 |
@3165 Oh, it's finding pitches there; that's funny ... don't know why that is. Anyway, now I have something that whenever I give it an attack... [snapping sound] ...It puts out a message. And this is a good thing if you want to do something like ... | es: @3165 |
@3180 Well, I won't insult your intelligences by doing this but, for instance, attach this to -- choose a random number and play a Risset bell tone or something like that. And then you have something where you can just do this [snapping noises] and out come bells or something like that. | es: @3180 |
@3195 And, that would be a very simple, you know sort of elementary thing that you could do with it. Here's a slightly less elementary thing. (I'm going to see if | es: @3195 |
@3210 I can get away with dropping the audio advance. I don't know if this is going to work or not because I want a smaller delay. So, it's still working OK,["hello" in mic]? | es: @3210 |
@3225 Sort of. And now, I can go down to 50 ?... 50! It's still good. <<In the "Audio Settings" panel, the delay is now set to 50 msec. >> We're happy. This is not instantaneous, but it's close, it's ...[snapping fingers in mic] ...it's fast enough that you can sort of pretend it's instantaneous.) | es: @3225 |
@3240 Now, what I'm going to do is go here and show you for instance how you can use this to measure time. So, I should have told you this before but there's a wonderful object called "timer" which | es: @3240 |
@3255 you use in the following way: | es: @3255 |
You click on the left inlet and it sets a stopwatch in a sense and then you click on the right and it reads it out. So, "bing, bing," @3270 one second. And, then if you whack it again, it tells you how much time has passed since that last one was done. So now, we just see rising numbers -- timer. | es: @3270 |
@3285 Timer is in a sense the opposite of metronome. Metronome you give it a time value and it generates events at desired times. The timer, you supply the events and it tells you what the time was. So, those two things might be a good thing to put in concert. | es: @3285 |
@3300 In concert ...for instance, suppose we wanted only one button, let's do this: What I will do is I will just | es: @3300 |
@3315 measure the amount of time between two different hits of the same button. So, every time we get a bang from the button, we will read the time out and then we will reset the timer. And then I am going to check what that does for us. | es: @3315 |
So, now we have button ... @3330 And so now -- we just have incremental times. Great. And now, we could do something like -- Oh, this could be a good time for metronome. So, let's, we'll need a bang and float. | es: @3330 |
@3345 And then, we'll feed it to a metronome. (I'm running out of room. I'm going to want this. Let's get this out of here. Maybe I should | es: @3345 |
@3360 shut this up now. ...) So, now we have already got a cool thing that will take events coming in...[snapping fingers] | es: @3360 |
@3375 and measure tempo. [snapping fingers] | es: @3375 |
So now, we have a way to measure tempo and have a @3390 synchronized drum machine, or metronome I guess. We'll have that be the time of the metronome and then we'll start the metronome whenever we get a measurement. | es: @3390 |
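The timer-into-metronome idea can be modelled outside Pd with plain timestamps. A minimal sketch, assuming times in milliseconds; the class `TapTempo` and function `metronome_ticks` are hypothetical names, and real event times would come from the attack detector rather than a list.

```python
class TapTempo:
    """Read out the time since the previous event, then restart the stopwatch."""

    def __init__(self):
        self.last_time_ms = None

    def tap(self, now_ms):
        """Return the interval since the previous tap, or None for the first tap."""
        interval = None if self.last_time_ms is None else now_ms - self.last_time_ms
        self.last_time_ms = now_ms  # reset the stopwatch
        return interval


def metronome_ticks(start_ms, period_ms, count):
    """Times at which a metronome started at start_ms would fire."""
    return [start_ms + k * period_ms for k in range(count)]


tempo = TapTempo()
tempo.tap(1000)           # first snap: nothing to measure yet
period = tempo.tap(1500)  # second snap: interval between the two snaps
print(period)
print(metronome_ticks(1500, period, 4))
```

The order matters in the same way as in the patch: read the elapsed time first, then reset.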
@3405 And then I will make that flash for now although I will do something more interesting in a moment. [clapping sound] | es: @3405 |
@3420 And so on like that.[laughter] [clapping sound] | es: @3420 |
Now, what if, for instance, every time we @3435 find out that there is an attack from bonk~, we just record what the attack was? You could do this well but I am going to do this sloppily. | es: @3435 |
Sloppy is ... We make a table. @3450 Let's make it 10 seconds' worth and then we'll use tabwrite~ to record into the table and we will just start recording whenever bonk~ says bang. And what we will record is | es: @3450 |
@3465 the audio coming in. | es: @3465 |
And then -- Every time this metronome goes whack we will just play the results of the table. And again, I will just be as low-tech as I possibly can. @3480 "tabplay~" ... And then we will get one of these to listen to it. Is this going to work? [snapping fingers] | es: @3480 |
@3495 We'll do this. Do this. [voice input in mic] | es: @3495 |
@3510 So, this isn't working because my voice is triggering it at crazy moments. ...[various voice sounds] That's actually not so bad. ... But what I really want to do is something like... | es: @3510 |
["Hello," "hello." and snapping fingers in mic] -- @3525 It's not working. Oh, right it's recording a new ... I have to do this: [voice sounds] | es: @3525 |
["Hello ... hello."] -- @3540 You get the idea. So now we have a ... How can I describe this? This is a tempo-driven sequencer. It is a thing that measures | es: @3540 |
@3555 the tempo of incoming events and loops at the same tempo as it's picking up. | es: @3555 |
Obviously, you are going to want ... Well, if you want this thing to work, you might have to practice having @3570 the thing not get set off accidentally. And figure out very carefully what your thresholds are, and where your mic is. Otherwise, someone is going to sneeze in the audience and set it off. And then you will be embarrassed. So, usual rules of show business apply. But at any rate, this is | es: @3570 |
@3585 a simple example of something that uses bonk~ for a non-trivial and interesting thing. ... OK, any questions about how this is working? Yeah? | es: @3585 |
Audience: Could you capture the sound @3600 and analyze it when there's a bonk? Miller: Oh, yeah, yeah. ... No -- So, bonk~ does a whole bunch of other stuff that I haven't described here. [laughter] | es: @3600 |
Miller: @3615 So, the original purpose to having bonk~ was to be able to actually transcribe a drum set in real time. So, it not only picks out attacks, but it also tries to classify instruments by their timbre. | es: @3615 |
@3630 And I'm not going to try to do it here right now because you need drum instruments and sticks to do this well. | es: @3630 |
You would need to do a much better job of mic-ing and playing than I can do right now. But it will actually give you @3645 a number from 0 to N where you train it on N different kinds of incoming sounds. | es: @3645 |
And, that is what I was using it for in '97. Someone was playing a drum set in Portland and we are listening to it in New York. And what we are doing is @3660 transcribing the drum and then sending the transcription over a network. | es: @3660 |
Yeah ... that's a whole thing. Yeah, all this stuff has a lot of history. People have been doing computer music @3675 for 20, 30 years now ... No -- More than that -- since '57. | es: @3675 |
So, lots of this stuff goes a lot deeper than what I'm talking about. So, basic notions are: This is the @3690 stupid thing that you just reach for whenever you want to know how loud something is. These are the sophisticated things that take a little bit of computation time, but will do cool things like figure out pitch or rhythms/attacks. ... Yeah? | es: @3690 |
Audience: @3705 How did you? ... Like for example when you are triggering the metro by snapping, you said you had to wait ... | es: @3705 |
Miller: @3720 Oh, right. So, what is happening is when you do this: [snapping fingers sound] | es: @3720 |
Then, what you get is the last thing. It doesn't play what happened between the two clicks, @3735 but what it plays is what happens after the second click. So, that's why to get it to whistle you do... [snap, snap, whistle] | es: @3735 |
... and there I was being careful @3750 not to do something that would re-trigger it -- Or else it would have done something else. | es: @3750 |
Yeah. And of course, if you don't keep changing it, it gets boring fast. @3765 Any questions about this? This is all just kind of...this is the very simplest kinds of, "Oh here's what you can do with audio analysis." Let me make one more good example or at least an example I like. -- It's probably not a good example. | es: @3765 |
@3780 So, let's go back to sigmund~ here which is giving us frequencies. And here we're already filtering out so that we only get the frequencies where it actually believes there is a pitch there. We're filtering out all the zeros. | es: @3780 |
@3795 So now, what would happen if we took our sound? So first off - sound. OK, ring modulation. Everyone knows about this. We will take a sound and multiply it by an oscillator. | es: @3795 |
@3810 (Ooh, we are running out of room. Put this over here. Try to remember about it later.) | es: @3810 |
So now, ring modulation: Take an incoming sound, multiply it @3825 by an oscillator. And I'll give the oscillator a frequency which I will control with a number box to start with. -- I already did this | es: @3825 |
@3840 quite a few weeks ago but now, we have a nice ["hello" in mic] ring modulator. | es: @3840 |
Now, ring modulation is a thing which gives you a nice harmonic result if you happen to hit the tone @3855 that it is -- Like, I think I can hit 90 Hertz: ["aaaaahhhh"]. And then it gives me something good. But if I give it a different pitch like ["aaahhhh"] then it gives me something inharmonic -- because that's how ring modulation works. | es: @3855 |
@3870 Well, what if you just decided to keep the thing harmonic by taking this pitch-tracked signal and making it the modulating sound? In fact, that's not going to be terribly interesting, but | es: @3870 |
@3885 let's do something a little destructive and multiply it by 20. (Oh, we don't need that tilde.) | es: @3885 |
So now, we're going to ring modulate by a frequency which is 20 times the measured frequency of my voice. @3900 In microphone: [And now, we have a wonderful...OK, so it's still pitched "aaaaahhhh."] Oh, it is bad. Yeah. I am having trouble because this bird is still running. "aaaaahhhh," But, so it's not continuous, but it | es: @3900 |
@3915 would be continuous if the bird weren't running right now. But the bird is important. So, we are going to keep the bird up.[laughter] | es: @3915 |
Actually, let's go back and, I'll tell it "You can be a little slower." @3930 And maybe we can get away with this for a little while. ["Aaaahhhh."] Yeah. It hates itself, doesn't it? I'm just overloading the CPU right now: ["aaaaahhhh."] So, there's a thing ... that's pretty bad. | es: @3930 |
@3945 Another thing is you can always take any sound that you want and drop it by an octave by ring modulating it by half of its original frequency. This is the well-known effect: ["hello, and now you have me down an octave"] Except of course, | es: @3945 |
@3960 if you divide the frequency of the incoming sound, the fundamental frequency, by 2 and then ring modulate by it, then you will only get the odd harmonics. So, right now, what you hear is an [In mic: "odd harmonic sound | es: @3960 |
@3975 that is an octave below what you're saying."] | es: @3975 |
If you want it more natural, let's say a sound that has both odd and even harmonics, then you would take the ring modulated sound but you would then add in the @3990 un-modulated sound too. Like this: | es: @3990 |
@4005 And now, you have [In mic: "... the good octave bi-coder"] | es: @4005 |
[In mic: "This sounds better in people with higher pitched voices @4020 than me"] But that's an all-purpose trick. In fact, that's plug-ins these days -- you just reach for a plug-in when you want that in your sound montage system. But this is basically how those things work more or less. | es: @4020 |
@4035 And now, you can do it real-time. | es: @4035 |
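Why the octave-down trick yields only odd harmonics can be worked out from the fact that multiplying two sinusoids gives their sum and difference frequencies. The sketch below only does the frequency bookkeeping; the 200 Hz fundamental, the four harmonics, and the function name `ring_mod_partials` are arbitrary choices for illustration.

```python
def ring_mod_partials(fundamental, n_harmonics, mod_freq):
    """Frequencies produced by ring modulating a harmonic sound with one sinusoid.

    Each partial at k * fundamental becomes a pair at (partial + mod_freq)
    and |partial - mod_freq|.
    """
    produced = set()
    for k in range(1, n_harmonics + 1):
        partial = k * fundamental
        produced.add(partial + mod_freq)
        produced.add(abs(partial - mod_freq))
    return sorted(produced)


f0 = 200.0       # assumed fundamental of the incoming voice
half = f0 / 2.0  # modulate by half the measured fundamental

wet = ring_mod_partials(f0, 4, half)
dry = [k * f0 for k in range(1, 5)]
mixed = sorted(set(wet) | set(dry))

# Express everything as a multiple of the new, octave-lower fundamental.
print([f / half for f in wet])    # odd multiples only
print([f / half for f in mixed])  # odd and even multiples once the dry sound is added
```

The original partials sit at even multiples of the half-frequency, which is why mixing the un-modulated sound back in fills the gaps.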
So, there's nothing funny about the chain. The chain is nothing but taking the adc~ and multiplying it by an oscillator. But the thing that's making it interesting now is the fact that we're using audio analysis @4050 to parameterize the thing that we're doing to the audio stream. | es: @4050 |
And you can do this with very small delays. There is a delay associated with trying to figure out the pitch of the signal -- which is on the order of the window size of the analysis -- @4065 which by default I think for sigmund~ is 1024 samples, but I'm not sure. | es: @4065 |
However, this signal chain that's going from the adc~ to the output doesn't have much of any delay at all. It's just going in and out. And so the thing is @4080 real-time in the sense of having a very small delay except that the pitch that it's modulating by is always going to be slightly out of sync with reality. -- Because the pitch determination is always going to be 10 or 20 milliseconds late. | es: @4080 |
Audience: @4095 Could you delay the microphone, too? | es: @4095 |
Miller: You could. And then you would maybe get a slightly better sound but you would also hear more delay -- so that would be a trade off. Yeah, and that could work better for voice but that wouldn't work terribly well @4110 for guitar or percussive sounds where delay is not good. | es: @4110 |
So, this is the menagerie of audio analysis stuff that's useful. And now, I am going to...(Oh yes, I am @4125 going to save it. Oh, and this is all in one patch. So this is going to be a glorious mess when you try to download it.)<<Saving "3.10/glorious.mess.pd">> [laughter] | es: @4125 |
@4140 And, now, what I want to do is go back and see ... | es: @4140 |
@4155 We're going to have to do some triage here because we got through this and then everything else is kind of not happening. So, I'm not going to show you netsend and netreceive. | es: @4155 |
This is if you have two computers you can -- messages only; this will not work with audio. (Although you can find objects @4170 in Pd extended that will do this for audio too.) If you have two computers and if you know the IP address of computer number 2 (which unfortunately has to be an IPv4 address; it doesn't know IPv6.) | es: @4170 |
You can send messages from the one to the other and @4185 they're just Pd messages. Now how it works is the messages just get printed as ASCII and sent in network packets and the object to do that, it's like send and receive, it's called netsend and netreceive and you have to give it the IP address or name, host name, of the machine | es: @4185 |
@4200 you're sending to. Among other things, you can, you know ... Don't do that. [laughter] | es: @4200 |
You can, yeah, you can, If you're one of @4215 these people that likes to project your patch while you're playing it, you can put a netreceive up there and then people in the audience who are in the know can send you messages and change your sound. Audience: Do you have to give it a port? | es: @4215 |
Miller: @4230 Yeah. So, netreceive takes a "port number" -- which is IP language. You give it a number like 3000 which is just the number that it will address by. And then the network send which is on a different machine has to know the machine's IP address and port number | es: @4230 |
@4245 and that port number can be a way by which you can have netreceive's that are receiving streams of messages from several different places, using different ports. | es: @4245 |
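Because the messages travel as plain ASCII, another program can talk to a netreceive object too. The following sketch assumes a TCP connection and assumes each message is a run of space-separated atoms ending in a semicolon; both details should be checked against the netsend and netreceive help files. The function names and the example message `freq 440` are made up.

```python
import socket


def format_pd_message(*atoms):
    """Join atoms with spaces and end with a semicolon, encoded as ASCII bytes."""
    return (" ".join(str(atom) for atom in atoms) + ";\n").encode("ascii")


def send_pd_message(host, port, *atoms):
    """Open a TCP connection to host:port, send one message, and close."""
    with socket.create_connection((host, port), timeout=2.0) as connection:
        connection.sendall(format_pd_message(*atoms))


print(format_pd_message("freq", 440))
```

With a patch containing a netreceive listening on port 3000 on the same machine, `send_pd_message("localhost", 3000, "freq", 440)` would be the call to try; nothing is sent until that function is called.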
readsf~ writesf~ ... So far, @4260 I've only told you how to record things into arrays which of course is then limited by how much memory you feel like giving the array and also, there's a bad thing, which is that when you read or write an array to disc, Pd itself grinds to a | es: @4260 |
@4275 halt while your computer synchronously reads or writes what might be several megabytes of data. | es: @4275 |
So, getting sound to and from disc is not -- using arrays and tables -- is not a real time activity. That's a thing that you would do @4290 before the show as you were setting up or before you start making sound anyway. | es: @4290 |
There's another thing, which is a pair of objects which actually spools sound to disc using separate threads so that it can happen in real time. And that's the objects called @4305 readsf~ and writesf~ -- And they are almost shockingly easy to use. | es: @4305 |
So, here for instance, let's just make... @4320 There's a nice oscillator. Here's a writesf~ object. And I'm going to give it the name of a sound file. I'm going to throw it in /tmp just to be ... No, | es: @4320 |
@4335 it's going to be more portable if I just put it right here. | es: @4335 |
Let's just writesf~ and then you can tell it how many channels you want but by default it's just one channel. -- You can make 100 channels if you want. Now, @4350 I'm going to have a message ... message please ... and the message says to open a sound file. And then you tell it ... | es: @4350 |
@4365 Once it's got something opened, there are messages to start and then to stop. | es: @4365 |
Actually, let's make this @4380 be the microphone again. This is not going to work perfectly because I have GEM running and so ... | es: @4380 |
@4395 So, here it is, we'll just say, "Hi this is a recording." Oh, yeah. And then of course, I didn't connect the start button and I didn't tell it to start which is to say I didn't do anything. (And also, I've got this other thing running which is going to get irritating. So, let's get that out of here. | es: @4395 |
@4410 And let's save this, it's going to be called "Record," and now maybe I can do this.) <<Saving "3.10/2.record.pd">> So, we open it and then we say | es: @4410 |
@4425 "start." [In mic: "this is a test. This is only a test."] And then, we say, "Stop." And now, we have a sound file. I'm not going to try and play it for you because I would have to make another patch with a readsf~ object but it exists ... | es: @4425 |
Don't worry. But, if I went and looked now, @4440 I would find a nice file called soundf1.wav which would be a 16 bit PCM sound file with that thing that I just did. So now, you can save the wonderful sounds you're making in patches. It'll do 16 or 24 bit or 32 in fact | es: @4440 |
@4455 if you want to save floating-point. It'll do .wav or AIFF although floating point AIFF is a bit of a stretch I think. Yeah, and it is 10 'til. So, I am going to just not tell you all the other good stuff. | es: @4455 |
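The difference between filling an array and spooling to disc can be illustrated outside Pd. This sketch writes a 16-bit PCM .wav file one small block at a time, so memory use does not grow with the length of the recording. It does not use a separate thread the way the lecture says the Pd objects do, and the file name, block size, and test tone are assumptions.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 44100  # assumed sample rate
BLOCK = 64           # samples generated and written per step


def spool_sine(path, freq=440.0, seconds=0.5):
    """Write a mono 16-bit sine tone to `path` block by block; return frames written."""
    total = int(SAMPLE_RATE * seconds)
    written = 0
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        out.setframerate(SAMPLE_RATE)
        while written < total:
            count = min(BLOCK, total - written)
            block = [math.sin(2 * math.pi * freq * (written + i) / SAMPLE_RATE)
                     for i in range(count)]
            # Scale to the 16-bit integer range and pack as little-endian shorts.
            pcm = struct.pack("<%dh" % count, *(int(32767 * x) for x in block))
            out.writeframes(pcm)
            written += count
    return written


path = os.path.join(tempfile.gettempdir(), "spool_sketch.wav")
frames = spool_sine(path)
print(frames, path)
```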
But pd~ -- @4470 if you have a multiprocessor -- will allow you to have all your processes run in separate Pd's and send signals back and forth between them in case you run out of gas. And yeah -- Show up, yeah, well show up obviously on Tuesday for final presentations | es: @4470 |
@4485 and also Tom Erbe will be teaching MUS 172 and he's wonderful. So, everybody show up for 172 and check that out. [applause] | es: @4485 |