AM(C)A with Michael Carpenter

sbrijmohan admin
edited December 2021 in Community - Events

Bring your questions about MC – and Depeche Mode – to MC's AMA at 10 AM Pacific on December 10!

AM(C)A. Michael Carpenter (a.k.a MC) is a telecom API lifer who has been making phones ring with software since 2001. As a Product Manager for Programmable Voice, he's at the intersection of APIs, SIP, WebRTC, and mobile SDKs. MC understands the importance of call behavior, how you can measure performance, improve call quality, and optimize engagement by leveraging Voice Insights.

If you could ask him a question about his work or this topic, what would you say? Please drop a comment with your question below!

We will keep this thread open to submit your early questions until the day of the AMA.



  • sbrijmohan
    sbrijmohan admin
    edited December 2021

    Community! As we approach our AM(C)A, MC wanted to say hello and better introduce himself to you all. Check out the video and comment your questions below.

    We will be closing this thread on Thursday (12/9) until the AMA, so now is your chance to get your early questions in!

  • Hi MC! I've been trying to learn more about Twilio Products and I'm not too familiar with Voice Insights. Do you mind going into more detail about what Voice Insights is and how it started?

  • mcarpenter
    edited December 2021

    Hi! Thanks for the question! Voice Insights is a big data analytics platform that gathers events, metrics, and call metadata and performs analysis, summarization, and aggregation on that data for our customers. Voice Insights started as an internal tool for Twilio's Super Network, Support, and Engineering teams but we knew this data was too useful to sit on so we've exposed a subset of the data to our customers in the form of the Voice Insights product. Pretty much we realized that half of our support tickets were getting resolved by us sending screenshots of our internal tools to customers and they would be like "woah wait what is that where do I see that?" so we put part of the information up in Console starting with just JS SDK WebRTC calls, and slowly over time we added support for other call types, and when we went GA in 2019 we made 85% of the product free. Since then we've been working on expanding the reach of Voice Insights into things like subaccount rollups for ISVs and Conference Insights for customers who use our conference product to orchestrate their call flows.

  • Hi MC, I'm really interested in building more with Voice Insights. What are some best practices for implementing Voice SDK events in applications?

  • MC, Your beard is EPIC! Kudos. Let's talk more about your beard journey, how long did it take you to grow your beard?

  • I love hearing different Twilion stories about how they found their home at Twilio. MC, can you talk more about how your journey with Twilio started?

  • Let's talk about SIGNAL!! What is your favorite developer experience from SIGNAL?

  • Hey Liz! One common thing that trips people up is that you don't need to have Voice Insights Advanced Features enabled on your account to utilize the events that are in the SDK. What Advanced Features brings to the party is the availability of those events via API, Console, and Event Streams, but you can still use the events in your app to notify users of potential issues.

    The most important thing is to give users the heads up that something might be happening so they can dynamically respond; imagine walking around the house with your laptop and you hit a dead spot in the wifi and now your connection is trash. If the apps you're using don't notify you of that you have no idea, but if the user gets notified "hey looks like your network just took a nosedive" they can adjust their behavior; e.g. move back into a spot with better wifi, yell at their roommate to stop streaming 4K Netflix, whatever.

    The other key thing lots of developers leave out is capturing subjective feedback. It's one thing for us to capture all these metrics in Voice Insights and present them to you, but the rubber really hits the road when you can tie that Voice Insights data back to subjective experience scores and you get a real sense for what individual users' thresholds are for certain behaviors. You might find some users are super chill about latency but flip their wig at the first crackle of jitter, whereas others are reporting dead air or dropped calls for calls that haven't even been established yet. Keeping an active eye on what the subjective experience of your users is, at scale, is what separates the good from the excellent in my experience.
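    To make the "give users the heads up" advice concrete, here is a minimal sketch of turning quality warnings into on-screen hints. The warning names, the CallBanner class, and the message text are all assumptions for illustration; check your Voice SDK's docs for the events it actually raises and wire these handlers to them.

```python
from typing import Optional

# Hypothetical warning names, modeled on the kind of network-quality
# warnings a Voice SDK can raise; confirm the real list in your SDK's docs.
WARNING_MESSAGES = {
    "high-rtt": "Your connection is slow; the other person may hear you late.",
    "high-jitter": "Your network is unstable; audio may sound choppy.",
    "high-packet-loss": "Your network is dropping audio; try moving closer to your router.",
    "low-mos": "Call quality has dropped; check your network connection.",
}


def message_for_warning(warning_name: str) -> Optional[str]:
    """Return a user-facing hint for a warning, or None for unknown warnings."""
    return WARNING_MESSAGES.get(warning_name)


class CallBanner:
    """Tracks which warnings are active so the UI can show or clear hints."""

    def __init__(self) -> None:
        self.active: dict[str, str] = {}

    def on_warning(self, warning_name: str) -> None:
        # Only surface warnings we have a helpful message for.
        message = message_for_warning(warning_name)
        if message is not None:
            self.active[warning_name] = message

    def on_warning_cleared(self, warning_name: str) -> None:
        # Clearing a warning that was never shown is a no-op.
        self.active.pop(warning_name, None)

    def visible_messages(self) -> list[str]:
        return list(self.active.values())
```

    The same handlers are a natural place to log the warning alongside whatever subjective score the user gives you after the call, so you can line the two up later.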

  • Thanks! I haven't been beardless in probably 20 years (I super hate shaving) but as I recall I went from baby cheeks to full blown Grizzly Adams in about 8 months. It also helps that I come from a long line of hirsute gentlemen and could pretty much grow my beard as it exists today by the time I was in junior high. Some people have 5 o'clock shadow. I have 9am shadow.

  • MC, so good to see you on the Forum! I'm curious to hear how "Who Hung Up" is determined?

  • I was at AT&T working on their developer platform, specifically their WebRTC implementation, and needed to write the docs so I went looking for other companies doing WebRTC so I could peep their docs (great artists steal) and found Twilio's. I remember thinking at the time "dang these folks really know what they are doing" and very rapidly realized that Twilio was going to eat AT&T's lunch in the telecom API space (spoiler alert: I was right). A little bit later a former coworker who I admired very much joined Twilio and hooked me up with a referral so I jumped ship at the first available opportunity.

  • OK Go did a really cool thing where everyone at SIGNAL dialed into a conference, put their phone on speaker, and the song OK Go were playing on stage played through everyone's speaker phone. They had to do some really complicated timing maneuvers to get their live performance on stage to synchronize with what people were hearing from their phones so it looked a little odd, and doesn't translate great on the video because the stage mics didn't pick up the speaker phone output well, but as a telecom nerd knowing how hard it was to pull off I was really impressed. Of course they give zero effs about street cred from me, but they were no joke, that was super tricky.

  • Under the hood we're all SIP all the time, so it's based on the SIP dialog and specifically who sent the SIP BYE. If the BYE came from the calling party we will mark it as caller, if the BYE came from the called party we will mark it as callee.

    There are sometimes race conditions where you can see conflicting reports of who hung up on two sides of the same call that are typically caused by both parties hanging up at the "same time" and each SIP signaling edge receives its BYE and reports on it.
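    As a rough illustration of the rule described above (not Twilio's actual code), the logic boils down to comparing the BYE sender against the two parties in the dialog. The function names and the SIP URIs in the usage note are hypothetical.

```python
from typing import Optional


def who_hung_up(bye_sender: str, calling_party: str, called_party: str) -> Optional[str]:
    """Label the hangup by which side of the SIP dialog sent the BYE."""
    if bye_sender == calling_party:
        return "caller"
    if bye_sender == called_party:
        return "callee"
    return None  # BYE came from someone we don't recognize


def reports_conflict(edge_a: Optional[str], edge_b: Optional[str]) -> bool:
    """True when two signaling edges disagree, e.g. both parties hung up at once."""
    return None not in (edge_a, edge_b) and edge_a != edge_b
```

    For example, who_hung_up("sip:alice@example.com", "sip:alice@example.com", "sip:bob@example.com") gives "caller", and if the other edge saw Bob's BYE first and reported "callee", reports_conflict flags the race.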

  • Hey MC, What is something you think everyone should do at least once in their lives?

  • Oooooh go see a total solar eclipse. My wife and I traveled to Oregon to be in the totality for the eclipse in 2017 and it was like full-on, getting choked up, emotional awe-inspiring. There's another one coming up in 2024 in North America and we already booked our travel plans. Don't sleep on it!

  • How does Twilio silence detection work? 

  • Also, what are the differences between Programmable Voice and Voice Insights? 

  • Today our silence detection is pretty severe; it's basically looking for either a missing RTP stream when one is expected (one is not always) or the absence of frequency and amplitude data in the RTP that was received; aka pure digital silence. In practice this means that users may detect silence on a call where Twilio does not because of things like hissing or buzzing or low amplitude background noise. Since we're not doing audio analysis it also means that contextually appropriate instances of silence are tagged; e.g. someone joining a conference call and parking on mute the whole time or someone answering an expected call with a one-time password. In context the silence makes sense, but it is still silence so we will tag it as such, so it's important to understand expected behavior.
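    A toy version of that "pretty severe" rule, to show why hiss or faint background noise does not count as silence. This is a sketch under assumptions: it looks at already-decoded integer samples purely for illustration, and the tag names are made up; it is not how Twilio implements detection.

```python
from typing import Optional, Sequence


def tag_silence(samples: Optional[Sequence[int]], stream_expected: bool = True) -> Optional[str]:
    """Tag a stream as silent only if it is missing or contains pure digital silence."""
    if samples is None or len(samples) == 0:
        # No media at all: only a problem when a stream was expected.
        return "missing_stream" if stream_expected else None
    if all(sample == 0 for sample in samples):
        # Every sample is zero: no amplitude, no frequency content.
        return "digital_silence"
    # Any non-zero sample (hiss, buzz, quiet room noise) is not tagged.
    return None
```

    Note that a muted conference participant would still come back tagged here, which is the "contextually appropriate silence" caveat above.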

  • The way I think about it is: Programmable Voice is the ability to create, receive, and manage calls using Twilio's REST API, TwiML language, Voice SDKs, and SIP interfaces. Voice Insights is metadata about those calls; i.e. events, metrics, and summary records that describe what happened, when, to whom, for how long, and how bad it got.

  • Hey MC, do you have any favorite Twilio Memories you'd like to share?

  • When working with Voice Insights, are there any audio quality issues that Voice Insights cannot detect?

  • I love talking about food! haha

    Are there any weird food combinations you enjoy and would like to put us onto?

  • Hey MC, can Voice Insights tell me why my voice sounds so much worse on a recording than in my head? HA

  • Also, I've been curious about the secrets of the universe, and how can I access them with Programmable Voice?

    But seriously, Is this data available using REST APIs? I would love to experiment more with it!

  • I have been at Twilio for six and a half years so I have sooo many.

    When I flew down to the old HQ on Harrison to interview I was coming from AT&T where I was one of 437K employees, which was something you got reminded of every single day in the most soul crushing ways, so I'm coming from just kind of the worst possible faceless multinational corporate culture experience. As the elevator door opened the person who greeted me at the front desk had punky hair and tattoos on their hands and the drink fridge behind them had a sign on it that said "For Guests don't be a turd" and before I even had a chance to open my mouth to introduce myself someone comes running down the hall full-speed screaming "f**k!" at the top of their lungs (which I believe was an early in-building form of PagerDuty).

    I then proceeded to get just grilled to the max, just like straight up torn to shreds by an interview panel of the five smartest people I'd ever talked to in my whole life, and I was like "these are serious brainiacs doing crazy difficult work on kind of unimaginably hard problems", but everyone was humble and authentic and loved working at Twilio and seemed to be having a genuinely great time, and I breathed a sigh of relief and relaxed because I knew I had finally found my people.

    Joining Twilio was like hitting a reset button on my whole life up to that point and I think that first moment walking in the door really encapsulates the experience for me, and even though it was before I was actually hired, I think about that fondly and frequently.

  • Hey MC, can you talk more about how insights are gathered? Also, are there any requirements to use Voice Insights?

  • cdennis
    edited December 2021

    Also, word on the street is you own your own winery? Is this true and when can I come to visit?🙃

  • Let's ask the question everyone is thinking... Cake or Pie?

  • Newman's Own sandwich cookies dipped in sour cream. It sounds wild but if you think about Oreos and Greek yogurt (Greek yogurt not being worlds away from sour cream) it stops sounding so crazy? To me at least.

    Oreos will work in a pinch, but if you haven't had Newman's Own sandwich cookies because you suspect they would be crappy knockoffs let me disabuse you of this notion for they are instead superior to their Nabisco-birthed forebear in every conceivable dimension.

  • As someone who is just starting their career in tech, I'm interested to hear if you have any advice or if you could share any challenges you came across in your career. I know you have been in the game for quite some time.

  • Only if the reason the recording sounds worse is due to jitter or packet loss, but believe me I empathize. Bolstered by the internal reflections and vibrations as it rattles around in the cavities of my skull my perception lends my voice a booming, sonorous quality, laden with depth and gravitas. Imagine my horror when confronted with the braying and brassy goose-honk I hear played back to me in recordings.

    RealTalk™️: Twilio's post-call recording processing routine applies a ~100ms jitter buffer to the stream before writing the file, so technically the recording will sound better than the live call for some jittery calls.
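    To see why a bigger jitter buffer can rescue audio, here is a minimal sketch of a fixed playout delay. It assumes 20 ms frames and a simple "did the packet arrive before its playout time" rule; the Packet type and function name are hypothetical, and real jitter buffers (Twilio's included) are more sophisticated.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    seq: int           # sequence number of the audio frame
    arrival_ms: float  # when the packet actually showed up


def playable_sequence(packets: list[Packet], buffer_ms: float = 100.0,
                      frame_ms: float = 20.0) -> list[int]:
    """Return the sequence numbers that arrive in time to be played, in order."""
    if not packets:
        return []
    first = min(packets, key=lambda p: p.seq)
    # The first frame plays buffer_ms after it arrives; each later frame
    # is due one frame duration after the previous one.
    base = first.arrival_ms + buffer_ms
    on_time = [
        p.seq for p in packets
        if p.arrival_ms <= base + (p.seq - first.seq) * frame_ms
    ]
    return sorted(on_time)
```

    With arrivals like Packet(0, 0), Packet(2, 45), Packet(1, 90), a 100 ms buffer plays all three frames in order, while a 0 ms buffer only gets frame 0 out in time.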

  • It is indeed. Call metrics and events are available near real time, and call summary records start being assembled within a couple minutes of the call ending. Peep the docs for the deets.

    REST APIs are cool and all but the new hotness is Event Streams where you can subscribe to events and we will just write them directly to a Kinesis stream and land them in your infrastructure. A much more elegant solution than trying to hit a bunch of REST endpoints after every call and trying to correlate that data with webhook callbacks you received.

    Some secrets of the universe that working in Voice has uncovered for me include: no two events have ever occurred at the same time, we as observers are just not perceiving them with sufficient granularity; and though it seems arcane and esoteric, the speed of light creates a floor for minimum latency on calls, and for calls that ping-pong all over the planet courtesy of our global telecommunications infrastructure that delay can be measured.
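    The speed-of-light floor is easy to put numbers on. A back-of-the-envelope sketch: the vacuum speed of light is exact, while the two-thirds velocity factor for fiber is a rough assumption, and the function name is just for illustration.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # exact value in vacuum
FIBER_VELOCITY_FACTOR = 2 / 3          # rough assumption for light in optical fiber


def min_one_way_latency_ms(distance_km: float, velocity_factor: float = 1.0) -> float:
    """Lowest possible one-way delay over a path of the given length."""
    return distance_km / (SPEED_OF_LIGHT_KM_PER_S * velocity_factor) * 1000.0
```

    Halfway around the planet is roughly 20,000 km, so min_one_way_latency_ms(20_000) comes out just under 67 ms in a vacuum and about 100 ms with the fiber factor, before any routing, queuing, or codec delay is added.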

  • Hey MC, outside of your PM role do you have any skills you would like to master?

  • Also, Do you have any resources you recommend that can help me build with Voice Insights?

  • Things that are part of the audio signal itself like background noise or echo caused by high gain microphones paired with high amplitude speakers aren't really possible to detect without decoding the audio information itself and analyzing the waveform. At Twilio's scale where we're doing millions of calls every day that is a very pricey operation, so we stick to things that don't require decoding audio today.

  • MC, give the people what they want! Can you share any sneak peeks or future feature ideas with Voice Insights or Programmable voice?😎

  • I don't have a winery but I do own a vineyard where I grow Dijon clone 777 pinot noir grapes on 24 year old vines. I also have an apple and pear orchard. I am not licensed by the state to produce spirits so technically any alcohol I made would be considered moonshine and thus illegal to produce, possess, or consume so of course I would never. Stop by whenevs!

  • We ain't gatekeeping: Voice Insights is included with every Voice minute on the Twilio platform, out of the box.

    The way we gather data is we have sensors at every media hop in our network and in the Voice SDKs that record metrics, capture events, and upload them to the Voice Insights platform. For now, depending on the call flow you could see up to 622 different parameters captured for a given call.

  • I would say my favorite desserts tend to be things like galette des rois which, although it has cake in its name, is way closer to pie... so I guess I would go with pie?

    My beef with cake mostly stems from me not being a big frosting guy. Like if someone asked you "hey do you want to just straight up eat this stick of butter" you'd be like "ew, no" but stir a little confectioner's sugar and vanilla extract into it, smear it on a cake, and all of a sudden we're supposed to be on board? It just seems like it's rarely worth the calories to moi.

  • Hey MC, I'm curious about how you started working in tech. I feel like this might be a good story!

  • Outside of your journey, I was thinking about using Voice Insights but am wondering if there are any limitations of the Insights Dashboard to be aware of?

  • I've had a very specific journey (which sounds solipsistic because I guess everyone's journey is definitionally specific) so I don't know how applicable my experience would be to others, but... if I could go back and change one thing it would be to spend less time in the corporate monolith world.

    Working at big places means you get to work on big cool things, like at my previous employer we upgraded the entire national cellular network four or five times (2G -> GSM/GPRS -> EDGE -> UMTS (3G) -> HSDPA/HSUPA/HSPA+ -> LTE) and that scale is impressive and you can learn a crazy amount and be really proud of the work you've done. But being a needle in the haystack as an employee, if I may stretch the metaphor, means that you are a recipient of work, whereas at scrappy startups and smaller businesses you can be an active participant, choosing your destiny as it were.

    I appreciate the time I spent in Fortune 50 companies because without that experience I would never have gotten work someplace like Twilio, but looking back I wish I had bailed on the behemoth stuff earlier or at least spent more time working at multiple companies so my development could have been informed by multiple perspectives; i.e. I could have learned different ways to approach the same problems, etc. A lot of my professional development at Twilio has been digging myself out from the patterns that 14 years at one company creates. 0/10. Would not recommend.

  • How can we check external network status? What are the tools used (if any) for audio issues?

  • I would really like to learn how to read Russian at the highest literary level. Most of my favorite writers are dead Russians (Nabokov, Gogol, Dostoevsky, Tolstoy). The challenges of translation are fascinating to me, but I can only imagine the experiential difference there must be reading the OG text.

    I have much love for some dead Frenchies too (Proust, Camus, Flaubert) but just in terms of the world that a new language would open up Russian kind of dunks on French (which I already have some poorly-maintained fluency with thanks to a couple years of college French back in the day).

  • What does high latency mean? And is there a difference between high round-trip time and high latency?

  • Oh you betcha! Check out the SIGNAL talks on Voice Insights that are available on YouTube; those give a good Fisher-Price My First Voice Insights overview, and we also have a free training that is available on our learning platform if you want to get into the nuts and bolts.

  • Total stroke of dumb luck and being in the right place at the right time really.

    I grew up in the scenic suburbs of Seattle which is right in Microsoft's backyard and they used to send reps to my high school computer club. For the youths™️, school computer clubs were a thing in the 1990s because not everyone had a computer at home if you can believe that, so you would sign up and get to hang out with other computer nerds after school and learn HTML or whatever.

    The Microsoft folks, who were dads and moms of kids I went to school with, would bring free software like demos of Flight Simulator and show us things like new Internet Explorer versions and stuff like that. At the end of the school year they would send recruiters to the computer club to talk to the graduating seniors; as I recall they were pretty much like "Hey, any of you nerds want jobs?" and I got hired doing phone support for VBA for Office at Microsoft basically straight out of high school.

  • The main limitation is that Voice Insights data is only stored for 30 days, so if you want to do stuff like look at things like month-over-month, quarter-over-quarter, or year-over-year you will want to consume the API or use Event Streams to pull that information into your systems for analysis.

    The Dashboard used to not support multi-dimensional filtering but we've addressed that. In fact we just released a new version of the filter builder recently so if you haven't checked out the Dashboard in a while I recommend swinging by and seeing what's new; you can accomplish some kind of incredible things with the new filters.

  • Absolutely!

    For Voice Insights you can expect more info on the conference side of things (hot dashboards ahoy) as well as anomaly detection and notifications based on Voice Insights call data; i.e. get notified when the number of failed calls to Canada increases by a certain amount.

    For Voice as a whole the sneak peek into transcriptions we gave at SIGNAL was really just the tip of the iceberg. Calls are transforming from a commodity into business intelligence and it's crazy exciting. Imagine being able to find every call where a customer was getting spicy about a policy in order to quantify the impact of a policy change. Today that is manual work humans have to do, but not for long!
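    If you want something like the failed-calls notification today, one way to approximate it in your own systems is a simple percentage-increase check. This is only a sketch of the idea with hypothetical names and a made-up default threshold; it says nothing about how the upcoming feature will actually work.

```python
def failed_call_alert(baseline_failed: int, current_failed: int,
                      threshold_pct: float = 50.0) -> bool:
    """True when failed calls rose by at least threshold_pct over the baseline."""
    if baseline_failed == 0:
        # No failures before: any failure at all is worth a look.
        return current_failed > 0
    increase_pct = (current_failed - baseline_failed) / baseline_failed * 100.0
    return increase_pct >= threshold_pct
```

    You would feed this with counts you have already pulled from the API or Event Streams, for example failed calls to Canada in the last hour versus the same hour yesterday.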

This discussion has been closed.