by Rich Jones, 11-07-2017.
tl;dr: I got frustrated, made a bot, deployed a bot, lost a bot, and now I'm going to share it with you!
So, in a moment of frustration, I wrote a bot which will automatically upload all Reddit video player submissions to YouTube.
To do this, I used the awesome Python library praw, as well as NoDB for storing data, and, of course, Zappa for deployment. In addition, I use youtube-dl and youtube-upload for downloading and uploading the videos, which are powered by ffmpeg under the hood.
This is a slightly janky, hour-long hack of a bot, but hopefully it'll help you see how a Zappa bot gets built.
In a new project directory and virtual environment, run:

pip install praw nodb zappa youtube-dl google-api-python-client progressbar2

youtube-upload isn't available on pip, so you'll need to git clone https://github.com/tokland/youtube-upload into your project directory. Next, grab the ffmpeg build from John Van Sickle's static ffmpeg builds page and extract it into your project directory.
Next, you'll need to create a new Reddit account for the bot to post under. Then, go to preferences and create a new application. Make sure you create a
script application, rather than a
Now, do the same for the YouTube account. Make a new "channel", then get an API keypair and authorize YouTube access for that keypair. Because of how YouTube uploads work, you'll now need to upload a test video manually with the youtube-upload executable in the bin folder. This will give you a link to visit in your browser to confirm your account, and will generate a credentials file in your home directory. Move this credentials file to your project directory.
You'll also need to make a new private S3 bucket to store your
Now, create a bot.py. First, our imports and our NoDB setup:
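The original listing didn't survive here, so this is only a minimal sketch of what the setup might look like, not the bot's actual code. The bucket name and the choice of id as the index field are my assumptions; the attribute-style configuration follows the NoDB README.

```python
# bot.py setup (sketch). The bucket name and index field are hypothetical.
BUCKET_NAME = "my-video-bot-bucket"  # hypothetical: your private S3 bucket


def make_db(bucket_name=BUCKET_NAME):
    """Return a NoDB handle that keeps one record per Reddit submission."""
    # Imported lazily so the module can be loaded without the dependency.
    from nodb import NoDB

    db = NoDB()
    db.bucket = bucket_name  # S3 bucket that holds the records
    db.index = "id"          # assumed: records are dicts keyed by submission id
    return db
```

With this in place, the rest of the bot can call make_db() once per invocation and pass the handle around.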
Next, make a main method and use praw to read the stream of new posts to the Reddit video domain.
For each of the items in the stream, check NoDB to see if we've already processed this item, and if not, fire off the item processor.
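Roughly, the loop could look like the sketch below. It is an illustration rather than the original code: the function names are hypothetical, reddit is assumed to be an authenticated praw.Reddit instance, db is assumed to be a NoDB handle indexed on id, and I'm assuming db.load() returns None for a key that was never saved. I'm also polling the newest posts on v.redd.it with praw's domain listing, which may differ from how the original read its stream.

```python
def latest_video_posts(reddit, limit=25):
    """List the newest submissions on Reddit's video domain."""
    # v.redd.it is the domain Reddit-hosted videos are posted under.
    return reddit.domain("v.redd.it").new(limit=limit)


def mirror_new_submissions(submissions, db, process):
    """Run process() on every submission that is not yet recorded in db."""
    handled = []
    for submission in submissions:
        # Assumed: load() gives back None when the id was never saved.
        if db.load(submission.id) is not None:
            continue
        process(submission)
        # Record the id so the next scheduled run skips this post.
        db.save({"id": submission.id})
        handled.append(submission.id)
    return handled
```

Keeping the dedupe logic separate from the praw call makes it easy to try locally with a plain list of fake submissions.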
Now that we've got the submission we want to mirror, let's download it!
First, we want to make sure we don't accidentally upload any pornography to YouTube, so we'll remove all NSFW posts and a few NSFW subs.
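A filter along these lines might look like the following sketch. The subreddit names are placeholders, since the original blacklist isn't shown; over_18 and subreddit.display_name are standard praw submission attributes.

```python
# Placeholder names; the original list of blocked subreddits is not shown.
BLOCKED_SUBREDDITS = {"example_nsfw_sub", "another_nsfw_sub"}


def is_safe_to_mirror(submission):
    """Return False for posts flagged NSFW or posted in a blocked subreddit."""
    if submission.over_18:
        return False
    name = submission.subreddit.display_name.lower()
    return name not in BLOCKED_SUBREDDITS
```

Note that this only catches posts that are actually flagged, or that live in a sub you thought to list.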
After that, we'll get the HLS stream from the submission item and call youtube-dl on it.
There are a few important things to notice here. First, note that we are being lazy and calling another Python executable directly. On Lambda, this is located at /usr/bin/python2.7. Next, we are downloading to /tmp, since this is the only writable directory on Lambda. Finally, we are passing the location of our own ffmpeg build for youtube-dl to use.
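Here is a sketch of how those three points could fit together. It is not the original listing: the function names, the output filename, and the default ffmpeg path are hypothetical, and running youtube-dl as a module with -m youtube_dl is my assumption about the invocation. The --ffmpeg-location and -o options are real youtube-dl flags, and media["reddit_video"]["hls_url"] is where Reddit exposes the HLS playlist.

```python
import subprocess

PYTHON = "/usr/bin/python2.7"  # interpreter path on Lambda, per the text


def hls_url_for(submission):
    """Pull the HLS playlist URL out of a Reddit-hosted video submission."""
    return submission.media["reddit_video"]["hls_url"]


def build_download_command(hls_url, submission_id, ffmpeg_path):
    """Return the youtube-dl command line and the file it will write."""
    output_path = "/tmp/{}.mp4".format(submission_id)  # /tmp is writable
    command = [
        PYTHON, "-m", "youtube_dl",        # assumed way of invoking youtube-dl
        "--ffmpeg-location", ffmpeg_path,  # use our bundled ffmpeg build
        "-o", output_path,
        hls_url,
    ]
    return command, output_path


def download(submission, ffmpeg_path="./ffmpeg"):
    """Download the submission's video and return the local file path."""
    command, output_path = build_download_command(
        hls_url_for(submission), submission.id, ffmpeg_path)
    subprocess.check_call(command)  # raises CalledProcessError on failure
    return output_path
```

Building the command as a list, rather than a shell string, avoids any quoting trouble with the URL.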
Now that we've downloaded our file, we need to upload it to YouTube. We'll use a similar process of calling the python executable directly, but with a twist to get around some of the funkiness of Google's client library.
Notice that we are supplying the location of our credentials-file from our local directory, and, most importantly, defining a custom PYTHONPATH for the env of this execution. Finally, we get the video_id of the result.
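As a rough illustration of that twist, here is a sketch under several assumptions: the helper names and the client_secrets.json and credentials.json filenames are hypothetical, the PYTHONPATH entries (the project directory plus the cloned youtube-upload directory) are my guess at what the subprocess needs, and I'm assuming youtube-upload prints the new video id as the last line of its stdout. The --title, --client-secrets and --credentials-file options are real youtube-upload flags.

```python
import os
import subprocess

PYTHON = "/usr/bin/python2.7"  # interpreter path on Lambda, per the text


def build_upload_command(video_path, title, project_dir):
    """Return the youtube-upload command line for one video."""
    script = os.path.join(project_dir, "youtube-upload", "bin", "youtube-upload")
    return [
        PYTHON, script,
        "--title", title,
        # Hypothetical filenames for the files moved into the project.
        "--client-secrets", os.path.join(project_dir, "client_secrets.json"),
        "--credentials-file", os.path.join(project_dir, "credentials.json"),
        video_path,
    ]


def build_upload_env(project_dir, base_env=None):
    """Copy the environment and point PYTHONPATH at our bundled packages."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONPATH"] = os.pathsep.join([
        project_dir,                                  # packaged dependencies
        os.path.join(project_dir, "youtube-upload"),  # the cloned repo
    ])
    return env


def parse_video_id(stdout_text):
    """Assumed: the video id is the last non-empty line on stdout."""
    lines = [line.strip() for line in stdout_text.splitlines() if line.strip()]
    return lines[-1] if lines else None


def upload(video_path, title, project_dir):
    """Upload the file and return the new YouTube video id."""
    output = subprocess.check_output(
        build_upload_command(video_path, title, project_dir),
        env=build_upload_env(project_dir),
    )
    return parse_video_id(output.decode("utf-8"))
```

Passing env to the subprocess call changes the module search path only for that child process, so the bot's own imports are left alone.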
Finally, we create a little Markdown comment with a link to our new video, and we post it as a reply to the video submission:
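Something like the sketch below would do it. The comment wording and the function names are made up for illustration; submission.reply() is the praw call for commenting on a submission.

```python
def build_comment(video_id):
    """Return a small Markdown comment linking to the mirrored video."""
    url = "https://www.youtube.com/watch?v={}".format(video_id)
    return (
        "[YouTube mirror of this video]({})\n\n"
        "*I am a bot.*"
    ).format(url)


def post_mirror_comment(submission, video_id):
    """Reply to the submission with the mirror link."""
    return submission.reply(build_comment(video_id))
```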
Now, we're ready to deploy! First, type
zappa init and point it to your source bucket and region. Since we don't need an API endpoint for this project, we can set
false in our settings.
Next, we'll want to set up a scheduled event for our function, so we create the following item for our settings file:
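The settings file itself is JSON; to keep the examples in one language, this sketch builds the fragment as a Python dict and prints it. The events list with function and expression keys is Zappa's documented scheduling format. The five-minute rate is a made-up value, and bot.main assumes the main method lives in bot.py as described above.

```python
import json

# Fragment to merge into your stage's entry in the Zappa settings file.
scheduled_event = {
    "events": [
        {
            "function": "bot.main",           # module.function Zappa calls
            "expression": "rate(5 minutes)",  # hypothetical schedule
        }
    ]
}

print(json.dumps(scheduled_event, indent=4))
```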
zappa deploy, and you're up and running!
Figuring some of this stuff out took a few tries (especially the weird stuff with setting the PYTHONPATH). To help debug this, I used some handy commands: zappa tail, which tails the Zappa output logs; zappa invoke, which directly invokes a function; and zappa invoke --raw, which directly executes supplied Python code in the remote environment.
zappa invoke --raw allowed me to invoke testing code without uploading the whole new package every time, which saved a bunch of time!
Response to the bot is mixed so far: some people seem to really love it, but some people seem to hate it too.
Unfortunately, the bot immediately uploaded some NSFW content that made it past the filter and got a content strike from YouTube. Shortly after that, it got banned from the Aquariums subreddit (TIL..), so it seems not everybody loves the bot. I've kept it running for now, but I doubt it will survive too much longer.
Maybe in the future I'll change the filtering to be whitelist-based rather than blacklist-based and have it only run in subreddits where the moderators approve of it, or only for subreddits that I care about... so basically only for r/skateboarding and r/trap.
Still, I wanted to use this as an opportunity to share some useful information on how to build serverless reddit bots, and how to debug Zappa applications.
Got comments? Shoot me an email to rich [[at]] pubmail [[dot]] io!