Show HN: A fully automated podcast – actually 12 podcasts

34 comments

·May 21, 2022

"That Horoscope Podcast - Aquarius" and its eleven siblings are daily podcasts that are programmatically generated end-to-end, i.e. scripted, voiced, post-produced and uploaded.

Would love to get some first impression feedback and hear how others would achieve the same thing!

planetsprite

Very good idea and execution. I think you could do a lot more interesting stuff than making it about horoscopes.

Also as a Turing test you should make one and never reveal it's entirely automated until it develops a big following. Due to the speed of automation you could mass produce podcasts of different types until one sticks, then put 10x more resources into the ones gaining more traction, etc.

holdenc137

I like your thinking :) I thought horoscopes (because they are kind of repetitive) were a good first fit - because the scripts could be generated from stock fragments 'A chance encounter' etc...

I think when the little glitches are ironed out of 'real' TTS - we'll be awash with generated content.

planetsprite

I can imagine. Imagine a podcast generation system. You'd simply have to describe the personalities of the host(s), the topic, the general vibe of the theme music, length, hardcode any sponsors, and a GPT-4 powered MLops service could produce something liked by a list of demographics 8 times out of 10 in 5 minutes.

There really is no stopping this train. In 2030 90% of internet content adopting the guise of being from the "real world" will be entirely generated by machine learning models. 90% of conversations you have with strangers online will likewise be with bots catered to influence you in subtle ways to maximize the return on revenue of your attention.

holdenc137

In a parallel to email spam-bots, we get personal agents which (by your choice) filter out the generated content and make sure you only get the real deal.

The fight is real.

arrmn

Can you elaborate more on the TTS? Did you prerecord fragments (how many did you actually do?) and you just stitch them together? So there is a may.mp3, 22.mp3 and your script just puts them together?

holdenc137

Sure.

For dates etc - you got it. I think from memory it would be 'Wednesday' + 'the 18th' + 'of' + 'may...' + '20' + '22'

For the narrative speech it would be more words in a file. There are plenty of files (EDIT: just checked 350ish files that cover all the variations of script that can be generated at the moment)

In general the TTS part of the project is the 'art of the almost possible' (if TTS engines sounded really good - I'd have just used one off the shelf)
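The date assembly described above can be sketched roughly as follows. This is only an illustration of the idea, not the author's code: the function names `ordinal` and `date_fragments` are hypothetical, and it assumes one prerecorded clip exists per returned fragment (weekday, "the Nth", "of", month, and the two halves of the year).

```python
from datetime import date
from typing import List

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def ordinal(day: int) -> str:
    """Return the day number with its English suffix, e.g. 18 -> '18th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def date_fragments(d: date) -> List[str]:
    """Split a date into the spoken fragments that would be stitched together.

    Assumption: each fragment maps to one prerecorded clip.
    """
    year = f"{d.year:04d}"
    return [
        WEEKDAYS[d.weekday()],    # Monday is 0 in datetime
        f"the {ordinal(d.day)}",
        "of",
        MONTHS[d.month - 1],
        year[:2],                 # '20'
        year[2:],                 # '22'
    ]


print(date_fragments(date(2022, 5, 18)))
```

Deriving the fragments from a real `date` object, rather than typing them in by hand, would also avoid announcing the wrong weekday or month.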

arrmn

How did you come up with the initial list of phrases? Did you do some kind of analysis of other horoscopes?

holdenc137

Listened to a few podcasts, read a few - then tried to come up with some combinations that (I hoped) were funny :)

Here's all of the current 'starts' for the main prediction:

(Note they're all pretty noncommittal - so anything could come next)

A bite from a wild animal

A financial matter coming to a head

Completion of a long delayed task

A seemingly generous gesture

A sudden realisation

An agreement with a headstrong peer

An unavoidable slowdown

Being pulled between two emotional options

A sudden eruption of feelings

Investigating a proverbial - light in the woods

Involvement with a purely private project

Making peace with the past

The chance of a big win

Today's socialising

suprjami

This makes me dread that soon many other podcasts will be automated like this, and it'll be orders of magnitude more difficult to find good content than it already is.

jclos

One person's dread will be another's business opportunity - is there any good search engine/recommender system for podcasts?

holdenc137

Time to start lobbying for a 'generated'=true/false flag on RSS feeds?

Also, I promise to only churn out inane content for LOLs.

Li7h

You have a text error in the description: "A daily horoscope podcast for Aquariums". Also the episode for May 22nd narrates the date as April 22nd. But I love this concept. Is this an AI speech engine or prerecorded snippets? Where did you get the text snippets from? Have you thought of incorporating GPT-3 into your horoscopes à la Co-Star?

holdenc137

The date was my mistake - the 'Aquariums' was for LOLs. (see also 'Librarians' and similar)

It's prerecorded snippets that came out of my mouth ;)

tobr

I can’t say I understand the point of this, so my only feedback is that the date in the episode from Saturday 21st of May is announced as “Thursday 21st of April”.

holdenc137

Yeah my bad - trust the human in the loop to put the date in wrong.

As to the point, it's programming practice, perhaps a stepping stone to more elaborate content-generation systems, and jolly good fun too.

edent

Disturbingly accurate in my case. I've seen many arcane things today - including matches.

(Which TTS are you using? Or have I misunderstood?)

papathunk

Haha.

Sadly I don't think any (commercial / phoneme based) TTS would be very listenable for a podcast. Those are hand rolled fragments of speech. (Think old school satnavs: "In " + "30 yards " + "turn left".)

planetsprite

What vocal synthesis program did you use? Sounds 100% real at parts.

holdenc137

Basically it is real. Because the possible scripts that can be generated are known - fragments of speech (eg 3,4,5 word phrases) were recorded (so the intonation is free).

Would be great to do it with an off-the-shelf TTS engine but I don't think they're quite there yet. I know my recording skills and microphone technique are rubbish - but if I knew what I was doing on that front - I think you'd be really hard pushed to tell it was stitched together phrases.
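For anyone wanting to try the fragment-stitching approach, here is a minimal sketch using only Python's standard `wave` module. It is not the author's implementation: the thread does not say which audio format or tooling was used, so uncompressed WAV clips with identical channel count, sample width and sample rate are assumed, and `stitch_wavs` is a hypothetical name.

```python
import wave
from typing import List, Sequence


def stitch_wavs(fragment_paths: Sequence[str], out_path: str) -> None:
    """Concatenate WAV fragments, in order, into a single WAV file.

    Assumption: every fragment shares the same channel count, sample
    width and sample rate; otherwise a ValueError is raised.
    """
    fmt = None
    chunks: List[bytes] = []
    for path in fragment_paths:
        with wave.open(str(path), "rb") as src:
            this_fmt = (src.getnchannels(), src.getsampwidth(),
                        src.getframerate())
            if fmt is None:
                fmt = this_fmt
            elif this_fmt != fmt:
                raise ValueError(f"{path} has a different audio format")
            chunks.append(src.readframes(src.getnframes()))

    if fmt is None:
        raise ValueError("no fragments given")

    with wave.open(str(out_path), "wb") as dst:
        dst.setnchannels(fmt[0])
        dst.setsampwidth(fmt[1])
        dst.setframerate(fmt[2])
        for chunk in chunks:
            dst.writeframes(chunk)
```

A call such as `stitch_wavs(["wednesday.wav", "the_18th.wav"], "episode.wav")` (file names hypothetical) would butt the clips together with no gap. Short pauses or crossfades between fragments, which would likely help the result sound natural, are left out of this sketch.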

planetsprite

The potential is 100x more with vocal synthesis imo. No need to make programmatic mad-libs style formats. Complete freedom, even though the quality isn't optimal.

holdenc137

Totally agree. I think we're probably only a year or so off TTS that can put some proper intonation into a sentence - hopefully then they'll be indistinguishable from live speech.

I've tried to listen to books with today's TTS and it soon becomes really grating (to my ears at least). It only needs the tiniest slip every few sentences and you can't listen any more.

vanous

Interesting! Is the code for the automation available?

papathunk

Will clean it up if there's enough interest. It's in a few parts ... 'transcript' generation, then the speech assembly... then dropping in the backing track + intro / outro, then uploading.

Which bit's of interest?

jasondigitized

I’d love to see it too. The L-system you mentioned and how you are stitching together the audio clips.

mro_name

whatever there may be, it's drowned in ad- and spyware.

papathunk

hmmm - it's on Anchor (spotify owned) - I wonder where the spyware is coming from.

number6

So how did you do it?

holdenc137

The generation of the 'script' uses a kind of L-system-like set of production rules. There's a big file along the lines of:

[the podcast] = [intro] [main body] [outro]

[main body] = [main prediction] [lucky colours] [alibi] etc

// these rules finally break down to text, eg

[main prediction start] = "A bite from a wild animal" or "A chance encounter"

So the script has lots of combinations and is semi random - but it should always make sense.
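The production rules above could be expanded with a small recursive function like the one below. This is a sketch under assumptions rather than the author's code: the dictionary layout and the `expand` helper are hypothetical, the placeholder strings stand in for rules the thread does not spell out, and only the two "main prediction start" texts are taken from the rules quoted above.

```python
import random
from typing import Dict, List

# Each symbol maps to a list of alternatives; each alternative is a
# sequence of tokens. A token that is not a key in RULES is literal text.
RULES: Dict[str, List[List[str]]] = {
    "the podcast": [["intro", "main body", "outro"]],
    "intro": [["(placeholder intro)"]],                 # hypothetical
    "main body": [["main prediction", "lucky colours"]],
    "main prediction": [["main prediction start", "(placeholder ending)"]],
    "main prediction start": [
        ["A bite from a wild animal"],
        ["A chance encounter"],
    ],
    "lucky colours": [["(placeholder lucky colours)"]],  # hypothetical
    "outro": [["(placeholder outro)"]],                  # hypothetical
}


def expand(symbol: str, rules: Dict[str, List[List[str]]],
           rng: random.Random) -> List[str]:
    """Recursively rewrite a symbol until only literal text remains."""
    if symbol not in rules:
        return [symbol]                      # terminal: a text fragment
    alternative = rng.choice(rules[symbol])  # pick one option at random
    fragments: List[str] = []
    for token in alternative:
        fragments.extend(expand(token, rules, rng))
    return fragments


script = expand("the podcast", RULES, random.Random())
print(" | ".join(script))
```

Because the output is a list of known text fragments, each one can be looked up as a prerecorded clip, which is what keeps the set of required recordings finite.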

visox

How many listeners did you get already from this?

holdenc137

I put an episode up for each star-sign this weekend.

Most of them have had 40-50 listens but the one linked here has 400+ listens!