Static site generator burnout
January 23, 2017
I think by now it is not a secret anymore that I am a strong advocate of static site generators, especially combined with some CDN making all your security and scalability concerns go away. When asked which one I recommend I usually throw out the two most used ones, Jekyll and Hugo, and also sneak in my own, drupan, with the disclaimer that it is opinionated, but has still proved to work quite well for different projects. Last year I did not really blog much, but I wrote a few posts. I did not really work that much on drupan, but there are new commits and a 2.4 release pending. What happened?
There are several reasons. Not getting much work done was basically the result of a nearly 5 month crunch time at the beginning of the year, followed up with even more stuff to do. I simply had to recuperate, gather some strength and inspiration, take a bit of time off and code less. But during the whole time one of the things I planned to do first, once I was back in the game, was taking care of my blog.
Editing is a pain - I miss a CMS
I think by now we all know the advantages of static site generators: writing in markdown, having everything in version control, ... Count me in on that. But writing markdown and writing blog posts are two different things. Blog posts have links, images, maybe a video? Manually copying around file names and URLs, hoping you got it right until you generate the site, is annoying. Manually managing your files is even more annoying.
I simply miss a proper CMS. I was certain I would fix this with Sakebowl, but I am not that sure anymore. Having a flexible CMS that plays nicely with a static site generator is certainly possible. There are a few out there, some of which barely made it out of toy stage and some of which are actually usable, but far behind a modern CMS.
This is a solvable problem, it is just not properly solved yet. This is also one of the reasons why I did not publish a lot last year.
Do not get me started on this mess. Write a file, send it to someone to review and edit, get it back, replace it, read it again to make sure the edits are correct and hopefully publish your article.
There are a few workarounds like keeping your whole site in a version control system and using a hosted service which allows editing files. This partly solves the problem and is not the worst idea. But it is also far from comfortable.
It also messes up your workflow to a certain degree. Right now I write most of my articles in Ulysses. After I am done I have to export them to a markdown file, drop to the command line, commit and push, send the link or review request to my fiancee, pull and build. Sure, it is doable, but to me it still sucks. If I had told myself 10 years ago that at some point I would consider this an unacceptable workflow, I am sure I would have declared my future self crazy.
Purely working in some online VCS hosting platform is also not really an option. Forcing people out of the applications they started using for some reason is, most of the time, a fight you will not win. Especially when it comes down to something where personal preferences are important, like an editor.
CDNs aka “love hate relationship”
Content delivery networks are, at least in my opinion, one of the best things that ever happened. Having files across the globe, highly available, served as fast as most consumer Internet connections allow for a stupidly low price? Sure, sign me up! And thanks to the fact that we are living in 2017 you can get SSL for free! Even better. How could anyone complain about this?! Well, believe me, I can.
So here is what usually happens: I write a blog post, get the editing mess done and publish it. But wait, it is not really published yet, it is only uploaded to S3. The CloudFront invalidation will take five to ten minutes before it is properly published to all my readers. Sure, I could lower the TTL and skip invalidation requests, but that still means sitting around for some time. Or I could make it so low that most likely every one of the 1000 requests from users a day results in a cache miss, making the request slower. For some reason I get a lot of bots hitting my site, but not distributed enough to keep the cache warm.
Now we could argue that this is kind of acceptable when you publish something. It does not happen that often, does it? But what about edits? When publishing a post it is possible that it is submitted to HackerNews or Reddit. Want to add a link? No problem, but readers will have to wait either for the TTL to expire, most likely resulting in some who would be interested missing the link, or for the invalidation request to finish, which is most likely a lot worse.
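To put a rough number on the TTL trade-off, here is a small back-of-the-envelope sketch. It is not a measurement of my setup: it assumes all requests go to a single URL on a single edge cache, that they arrive randomly (Poisson), and that the TTL starts counting on each cache miss. The function name `ttl_hit_ratio` is made up for this example.

```python
def ttl_hit_ratio(requests_per_day: float, ttl_seconds: float) -> float:
    """Expected cache hit ratio for one URL on one edge cache.

    Assumptions (mine, for illustration only): requests arrive as a
    Poisson process and the TTL starts when a miss fills the cache.
    Each miss is then followed by rate * ttl expected hits, so the
    hit ratio is (rate * ttl) / (1 + rate * ttl).
    """
    rate = requests_per_day / 86_400  # requests per second
    expected_hits = rate * ttl_seconds
    return expected_hits / (1 + expected_hits)


if __name__ == "__main__":
    # Compare a few TTL values for 1000 requests a day.
    for ttl in (10, 60, 600, 3600):
        ratio = ttl_hit_ratio(1000, ttl)
        print(f"TTL {ttl:>5}s -> hit ratio {ratio:.2f}")
```

Under these assumptions a 60 second TTL already means that more than half of the requests miss the cache, and real traffic is spread over many URLs and many edge locations, which makes it worse.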
This does not mean hosting your content on a CDN is a bad idea, it simply means that sometimes I want to update something in real time.
There is a certain set of new features I want to add to my blog. A few of them will be a matter of minutes, like adding a new category for short notes I can use to post more frequently instead of splitting text on Twitter, for example. But there are things, even with a proper CMS backend, which are either hard to achieve or greatly increase the backend complexity.
Subscribe to blog
I want my readers to be able to subscribe via email. Sounds simple, doesn’t it? If you have a purely static site the best option will most likely be using a third party service like Mailchimp and manually sending out a mail or somehow integrating it as a plugin in your static site generator. Since the URL for your content is known before you publish the site, even if your index and archive are not yet up to date thanks to the CDN, you can trigger a mail pointing directly to the post.
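As a rough idea of what such a plugin could produce, here is a minimal sketch that only builds the notification mail with Python's standard library. It is not an existing drupan plugin; `build_announcement` and the example.com addresses are made up, and actually sending the message (via `smtplib` or a third party service) is left out.

```python
from email.message import EmailMessage


def build_announcement(title: str, url: str, sender: str, recipient: str) -> EmailMessage:
    """Build a plain text mail pointing directly to a freshly published post.

    The post URL is passed in because it is known before the site is
    uploaded, so the mail does not depend on the index being up to date.
    """
    msg = EmailMessage()
    msg["Subject"] = f"New post: {title}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"A new post was just published:\n\n{title}\n{url}\n")
    return msg


if __name__ == "__main__":
    mail = build_announcement(
        title="Static site generator burnout",
        url="https://blog.example.com/static-site-generator-burnout/",  # hypothetical URL
        sender="blog@example.com",
        recipient="reader@example.com",
    )
    print(mail)
```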
Post via email
Fancy, isn’t it? I want to sit on a beach in San Francisco - hey, I am allowed to have a weekend, even on a work related trip! - and post a picture. Partly to annoy my friends back at home suffering through 8°F days, partly to inform others that I am currently in SF, which automatically means I am up for a coffee or beer. Don’t ask why, but this sometimes works better than sending a tweet or email.
Sure, I could log in to a web interface, if I had one, and use this to post. This would even be acceptable, but sending an email is faster. Also I like email. It is one of the few things that never messed up badly during the last 20 years.
Actually posting from anywhere, even if it would involve a web interface, would already be a big step in the right direction.
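The parsing part of posting via email is the easy bit. A minimal sketch, assuming the raw message is already available as a string (receiving it, checking the sender and storing the picture are the parts that need a running backend), could look like this. `Post` and `email_to_post` are hypothetical names, not part of drupan or Sakebowl.

```python
from dataclasses import dataclass, field
from email import message_from_string
from email.policy import default


@dataclass
class Post:
    title: str
    body: str
    images: list = field(default_factory=list)  # attachment file names only


def email_to_post(raw: str) -> Post:
    """Turn a raw email into a post: subject -> title, plain text -> body."""
    msg = message_from_string(raw, policy=default)
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content().strip() if part is not None else ""
    # Collect the names of attached images; the image data is ignored here.
    images = [
        attachment.get_filename()
        for attachment in msg.iter_attachments()
        if attachment.get_content_maintype() == "image"
    ]
    return Post(title=str(msg["Subject"] or "Untitled"), body=body, images=images)


if __name__ == "__main__":
    raw = "From: me@example.com\nSubject: Beach day\n\nHello from SF\n"
    print(email_to_post(raw))
```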
Queues, workers, complexity
Drupan is written in Python. And Sakebowl, the backend, is partly written - as in started but not finished - in Python and Django. Adding new features nearly always requires having a queue, even one as simple as Redis, and having workers running doing the heavy lifting. Simple enough, isn’t it? Especially because those are things I have been dealing with on a daily basis for over 12 years now.
But something in me is telling me it is not worth it. Having all those different things running, making sure they are available, come up on restarts, handling errors - it is some complexity I am not sure I am willing to maintain in my spare time only to publish a few articles.
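To make the moving parts concrete, here is a toy version of the queue and worker pattern. It deliberately swaps Redis and separate worker processes for Python's in-process `queue.Queue` and a single thread, so it shows the shape of the problem, not a production setup. `worker` and `run_jobs` are made-up names.

```python
import queue
import threading


def worker(jobs: queue.Queue, results: list) -> None:
    """Pull jobs off the queue until the None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut down the worker
            jobs.task_done()
            break
        try:
            results.append(job())
        except Exception as exc:  # a failing job must not kill the worker
            results.append(exc)
        finally:
            jobs.task_done()


def run_jobs(tasks) -> list:
    """Run all tasks on one background worker and return their results in order."""
    jobs: queue.Queue = queue.Queue()
    results: list = []
    thread = threading.Thread(target=worker, args=(jobs, results), daemon=True)
    thread.start()
    for task in tasks:
        jobs.put(task)
    jobs.put(None)  # tell the worker to stop after the last task
    jobs.join()
    thread.join()
    return results


if __name__ == "__main__":
    print(run_jobs([lambda: "site rebuilt", lambda: 1 / 0]))
```

Even this toy already needs a shutdown signal and error handling. The real thing adds a separate Redis process, restarts and monitoring on top, which is exactly the complexity I am unsure about.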
There are currently no alternatives I would be satisfied with. I will never move my content to a third party platform like Medium. I will not start messing around with some stale Node.js abomination that consumed an insane amount of crowd funding money for no apparent reason. But I am also not sure Sakebowl will be the right call.
Currently I am considering moving from a static site generator to a more mainstream CMS or blogging software. I am not too worried about uptime to be honest. Nearly all hosting providers I have dealt with during the last 5 or 6 years rarely have any downtime and if they do it is a matter of minutes. I actually had more downtime on AWS (admittedly not S3 or CF!) in 2014, if I recall correctly, than on my Hetzner vserver. Sure, different beast, but you get the point.
I am also not too worried about content being linked on some popular site resulting in a spike in requests. Having the site or page cached in memory should not be hard for any serious blogging platform. And it will most likely be “good enough”.
But I still have this point somewhere in the back of my mind. A blog going down because it makes it to the frontpage of HackerNews is just lame. It would still be possible to put a CDN in front of whatever, so this should not turn into a problem, but this also would not be much of an improvement to the current setup.
Security is a tough cookie. I do not believe any system can beat the current S3 + CloudFront setup. But at the same time I believe it is possible to get a system secure enough that, while not directly targeted, it will survive the big, bad Internet. If directly targeted, well, it is most likely game over. Whether this turns out to be a major problem is something I cannot predict.
I would prefer having a simple setup. Installing an interpreter, a database, a key value store, a webserver, messing with some obscure startup script to get application servers and workers running is something I would love to avoid doing in my free time.
Are static site generators a bad idea?
Sure, that would make a great headline and most likely guarantee a heated discussion, but sorry to disappoint you. They are as good, or bad, as they always were. I am simply looking for features which are not that easily realisable using a static site generator and which are maybe also not that well realised with a backend for one.
I believe static site generators are definitely the right choice for anything that is not updated frequently and has a technical editor. You can even make it simple enough to make copy changes - GitHub + some CI and you are good. Not that comfortable, maybe a small fight against existing tools, but you could manage to get away with it. Agencies maintaining websites for customers? Definitely! Chances are pretty low that your friendly doctor needs any dynamic elements on the website, at least in most parts of the world. Marketing site for a SaaS business? Yep. Decoupling your marketing site from your actual app is not the worst idea.
I still love drupan and I am actively working on it. But for the things I want it seems to be a bad fit - this is actually harder to admit than it should be. I am currently not sure what the correct solution will be, but I am actively working on it.