TheJach.com

Jach's personal blog


Blog rewrite notes

Last time I wrote about a minor itch to rewrite this blog from PHP to Lisp. I started off by discussing the database schema; this time I'll briefly cover a few of the operations the code performs on that data.

(Like many websites, ultimately this is just a nicer looking interface to a database.)

Websites can also be modeled as functions mapping routes to responses. This is how my code is structured. Every request (unless whitelisted otherwise) goes through index.php. It uses the first segment of the route, such as /view, to determine a more specific "service" to handle the rest of the route if it can. The services I have are:


switch($service) {

// Secure pages
case 'admin': $serviceClass = 'AdminService'; break;
case 'account': $serviceClass = 'AccountService'; break;

// Public pages
case 'user': $serviceClass = 'UserService'; break;
case 'feed': $serviceClass = 'FeedService'; break;

// Home (also public)
case 'index':
case 'home':
case 'about':
case 'view':
case 'comment':
default: $serviceClass = 'HomeService'; break;

}


Each service has a route map from URL patterns to handler functions. HomeService is a bit bloated... But let's look at each service in turn:

For AdminService:


$this->urls = array(
'get,post:/admin' => 'manage_posts'
, 'get,post:/admin/post/@action' => 'manage_posts'
, 'get,post:/admin/post/@action/@id' => 'manage_posts'
, 'get,post:/admin/users/@action' => 'manage_users'
, 'get,post:/admin/users/@action/@id' => 'manage_users'
, 'get,post:/admin/comments/@action' => 'manage_comments'
, 'get,post:/admin/comments/@action/@id' => 'manage_comments'
, 'get,post:/admin/comments/@action/@id/@delete' => 'manage_comments'

, ':/admin/loadbayes' => 'load_blog_data'
);


This is not a great breakdown. Each page typically contains a form element which posts back to the same page, so each route handles both GETs and POSTs. (The last one has no method annotation, so technically it also handles both. It's only meant to handle GETs, but I didn't want it to be cacheable, and I have a convention that a handler annotated with only get is cacheable.)
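As a rough sketch of how a route map in this style could be matched (written in Python rather than the site's PHP, and not the actual implementation): each key is split into its method list and its pattern, and every @name segment becomes a named capture. The function names, the first-match-wins ordering, and the tolerance for a trailing slash are all assumptions made for illustration.

```python
import re


def compile_route(key):
    """Split 'get,post:/admin/post/@action' into (methods, compiled regex).

    An empty method list (as in ':/about') is taken to mean 'any method'.
    """
    methods_part, _, pattern = key.partition(":")
    methods = frozenset(
        m.strip().upper() for m in methods_part.split(",") if m.strip()
    )
    parts = []
    for segment in pattern.split("/"):
        if segment.startswith("@"):
            # '@id' captures exactly one path segment under the name 'id'.
            parts.append("(?P<%s>[^/]+)" % segment[1:])
        else:
            # Literal segments are escaped so 'index.php' matches literally.
            parts.append(re.escape(segment))
    return methods, re.compile("^" + "/".join(parts) + "/?$")


def match_route(urls, method, path):
    """Return (handler, params, cacheable) for the first match, else None."""
    for key, handler in urls.items():
        methods, regex = compile_route(key)
        if methods and method.upper() not in methods:
            continue
        found = regex.match(path)
        if found:
            # Convention from the post: only 'get'-only routes are cacheable.
            cacheable = methods == {"GET"}
            return handler, found.groupdict(), cacheable
    return None


ADMIN_URLS = {
    "get,post:/admin": "manage_posts",
    "get,post:/admin/post/@action": "manage_posts",
    "get,post:/admin/post/@action/@id": "manage_posts",
    ":/admin/loadbayes": "load_blog_data",
}
```

With this sketch, match_route(ADMIN_URLS, "GET", "/admin/post/edit/42") picks manage_posts with action "edit" and id "42", marked as not cacheable because the route also accepts POST.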

For managing posts, the possible actions are new and, if an id is given as well, edit and delete. Where it makes sense, these actions also have views for editing the data (like the new post form) before performing the final action (publishing a new post).

One enhancement I'd like to have is the concept of draft vs published. I've lost a few posts by relying on the browser to remember form input, only for it to forget that input after a crash or restart. Having a draft (with auto-saves) would be a nicer experience.

For managing users, the plan was that I could create new users myself (it looks like that feature was never done), edit their properties (also not implemented), or view and delete them (I did implement this). None of this is really necessary in a newer design that does away with most user management apart from my own user.

Managing comments has a few actions: remove, which replaces the comment's text with a message saying it was removed; delete, which actually deletes the whole comment; and edit, which lets the text be edited. I use the edit feature to manually create anchor links for some comments that leave URLs. Doing it by hand means I actually visit the URLs first and decide if they're worthy of linking. Anyway, editing visitor comments is quite useful.

The last function, load_blog_data, is one I forgot about until now. The docstring says to rerun it from time to time because if I edit a post, the naive bayes training data is still based on the original. In essence this just retrains the naive bayes classifier with all blog content. I don't make very significant edits even when I do edit, so that's probably part of why I forgot about this. Still, a slightly useful feature, and one that probably ought to be exposed with a UI button somewhere if I keep it.

While not exactly apparent from the routing, I have a 'naive bayes' button that when clicked returns the categories it predicts the current post most likely belongs to. Tag classification might be a rare feature on other platforms, but I like it. I have a 'preview' button as well that shows me the post instead of posting it. I also have a few JS-based utility buttons for use while writing posts, like one for escaping angle brackets (posts are edited as HTML, apart from line breaks) and Linkify, which goes through and auto-a-links things.
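For a sense of what retraining and tag prediction might involve, here is a minimal multinomial naive Bayes sketch in Python (not the site's PHP, and not its actual classifier). The class name, the tokenizer, and the add-one smoothing are assumptions. retrain throws away the old counts and rebuilds them from every post, which is what load_blog_data is described as doing.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TagClassifier:
    def __init__(self):
        self._reset()

    def _reset(self):
        self.word_counts = defaultdict(Counter)  # tag -> word -> count
        self.doc_counts = Counter()              # tag -> number of posts
        self.vocab = set()

    def retrain(self, posts):
        """Discard old counts and rebuild from (text, tags) pairs."""
        self._reset()
        for text, tags in posts:
            words = tokenize(text)
            self.vocab.update(words)
            # A post with several tags counts once toward each of them.
            for tag in tags:
                self.doc_counts[tag] += 1
                self.word_counts[tag].update(words)

    def predict(self, text, top=3):
        """Return up to `top` tags, most likely first."""
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for tag, n_docs in self.doc_counts.items():
            counts = self.word_counts[tag]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus add-one smoothed log likelihood of each word.
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            scores[tag] = score
        return sorted(scores, key=scores.get, reverse=True)[:top]
```

Because edits to a post change its word counts, a model trained on the original text slowly drifts from the content, which is why a periodic full retrain is the simplest fix.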

The AccountService is the worst... Right now it contains a handler for /account/change_password, but that will result in a page saying it's not implemented. There are basically no features for registered accounts (apart from not having to solve the simple captcha) -- another reason to kill support.

UserService has:


$this->urls = array(
'get,post:/user' => 'display_register'
, 'get,post:/user/login' => 'login'
, 'get,post:/user/logout' => 'logout'
, 'get,post:/user/register' => 'display_register'
, 'get,post:/user/register/handle' => 'handle_register'
, 'get,post:/user/forgot' => 'display_forgot'
, 'get,post:/user/forgot/handle' => 'handle_forgot'
, 'get,post:/user/reset/@u_id/@conf_code' => 'reset_password'
, 'get:/user/benefits' => 'display_benefits'

);


Pretty straightforward. As mentioned before, forgot/reset was not implemented. Registration, login, and logout still work though. Login also sets a cookie with a very long expiration date. Something I wish more sites still did.

FeedService handles my RSS feed:


$this->urls = array(
'get:/feed' => 'display_rss_feed'
, 'get:/feed/rss' => 'display_rss_feed'
, 'get:/feed/atom' => 'display_atom_feed'

);


Apparently I never implemented the Atom feed. Maybe someday! Since Firefox nuked its native RSS support, the extension replacement hasn't been quite as nice, and it seems to treat my blog especially bizarrely, e.g. the favicon doesn't show up for each item. I think Atom might fix that...

Before I get to the big service, apparently there's also a "BlogFormService"... but in reality this is more of a utility class. I think I wanted to merge it with UserService at some point. Anyway, it handles a couple of things like my makeshift captcha system and username existence checks for registration or commenting.

Now the big one... "HomeService".


$this->urls = array(
'get:/index.php' => 'display_home_page'
, 'get:/home' => 'display_home_page'
, 'get:/index' => 'display_home_page'
, 'get:/home/@page' => 'display_home_page'
, 'get:/home/page/@page' => 'display_home_page'

, ':/about' => 'display_about_page'
, 'get:/view/tags' => 'get_tag_list'
, 'get:/view/tags/@tag' => 'display_by_tag'

, 'get:/view/id/@id' => 'display_post'
, 'get:/view/posts/all' => 'display_all_posts'

, 'get:/view/posts/count/@year' => 'count_posts_from'
, 'get:/view/posts/count/@year/@month' => 'count_posts_from'
, 'get:/view/posts/@year/@month' => 'get_posts_from'

, 'get:/view/@year/@month/page/@page' => 'display_by_date'
, 'get:/view/@year/page/@page' => 'display_by_date'

, 'get:/view/@year/@month/@name' => 'display_post'

, 'get:/view/@year/@month' => 'display_by_date'
, 'get:/view/@year' => 'display_by_date'
, 'post:/comment/new_comment' => 'new_comment'
, ':/comment/unsubscribe/@email/@post_id' => 'unsubscribe_email'
, ':/portfolio' => 'display_portfolio_page'
, ':/portfolio/@singlepage' => 'display_portfolio_page'

);


I've omitted a route that could be used to DoS my database, but otherwise this is complete... (It just calculates various statistics.)

I say it's big, but the actions it supports are relatively straightforward. The home page shows the several most recent posts. Then there are various views. You can view tags (this returns a JSON response though), or more usefully posts that have a specific tag. You can view a specific post. You can view all posts. You can view posts by year, or by year and month. (There are also some JSON-returning count routes for each to support the navigation.) Every view of multiple posts is paginated, except "all" and "by tag".
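The pagination behind routes like /home/page/@page and /view/@year/page/@page can be sketched as below, in Python rather than PHP. The page size of 5 and the clamping of out-of-range pages are assumptions. A real handler would more likely push the same arithmetic into the SQL query as a limit and offset than slice a list in memory.

```python
import math


def paginate(items, page, per_page=5):
    """Return (items_on_page, total_pages) for a 1-based page number."""
    total_pages = max(1, math.ceil(len(items) / per_page))
    # Clamp out-of-range page numbers instead of returning an empty page.
    page = min(max(1, page), total_pages)
    start = (page - 1) * per_page
    return items[start:start + per_page], total_pages
```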

Also hidden in here is the form handler for new comments. I think I didn't make it part of the user service because technically "guests" aren't users (except at Salesforce). It was also convenient code-wise, as many helper methods like getting post data (for e.g. the post title) are defined on the service itself instead of separately. MVC? I hadn't heard of it when I started this design!

Because guests can also provide their email to receive notifications (with each email offering them the ability to unsubscribe) I handle that action here too.

And that's it!

There's a bit of special logic in the sidebar for the navigation, recent posts/comments, and tag cloud, but that's all the features this blog has.

So it shouldn't actually be that much work to port it over to Lisp. There are a few architectural differences I'll need to consider, as well as opportunities. The number one difference is that the normal way to write a Lisp web service is not like PHP -- that is, it's not the stateless CGI model, with each request spawning a new process. Rather, a persistent Lisp process is always running, much like the battle-hardened web server in front of it (Apache, nginx, etc.). I rather like the stateless model, and you can program close to it, but recognizing that the process is persistent allows for some new designs. For instance, instead of basing my caching implementation on writing files and reading them back, I could just keep the cache in Lisp's heap memory.
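To make the in-memory idea concrete, here is a small sketch of a page cache that lives in the heap of a long-running process. It is in Python rather than Lisp or PHP, and the class name, the time-to-live expiry, and the lock are all assumptions; the existing file-based cache may well invalidate differently.

```python
import threading
import time


class PageCache:
    """Keep rendered pages in process memory, keyed by route."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}  # route -> (expires_at, body)

    def get_or_render(self, route, render):
        now = self._clock()
        with self._lock:
            entry = self._entries.get(route)
            if entry and entry[0] > now:
                return entry[1]
        # Render outside the lock so one slow page doesn't block the others.
        body = render()
        with self._lock:
            self._entries[route] = (now + self._ttl, body)
        return body

    def invalidate(self, route=None):
        """Drop one route, or everything (e.g. after publishing a post)."""
        with self._lock:
            if route is None:
                self._entries.clear()
            else:
                self._entries.pop(route, None)
```

The tradeoff against the file-based approach is that this cache disappears whenever the process restarts, so the first request for each page after a deploy pays the rendering cost again.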

Also in the PHP world, each request that requires a DB connection establishes one at the start of the request, runs several queries to render a page, and then releases it. Before I had the cache system, every request would do this, and if I got too much traffic, I'd quickly hit the concurrent connection limit. (This was especially painful when I was on shared hosting with HostGator.) With a persistent process, it makes more sense to have a connection pool. Whether or not each connection is held open indefinitely, the pool itself ensures that under load a request just blocks and waits for a free connection to query with. Much better than the error message spam before.
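The blocking behavior of a pool is easy to sketch with a thread-safe queue. This is Python rather than Lisp, with sqlite3 standing in for the real database, and the class name, pool size, and timeout parameter are assumptions. When every connection is checked out, the next caller waits in get() instead of opening a new connection and tripping the server's limit.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A fixed set of open connections shared by all request handlers."""

    def __init__(self, connect, size=4):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the query failed.
            self._idle.put(conn)


def open_connection():
    # Stand-in for the real database; check_same_thread=False lets the
    # connection be used from whichever thread borrows it.
    return sqlite3.connect(":memory:", check_same_thread=False)
```

A handler would then wrap its queries in "with pool.connection() as conn:", so the connection is returned to the pool as soon as the page has been rendered.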

Editing or adding new code also changes. I have edited a lot of code for this site in production... Also, the way the content system works, content usually lives in '.tpl' template files. Sometimes this is a small thing: the 'recent posts' bit of the sidebar is its own standalone tpl file that gets injected into the larger framing 'maincontent.tpl'. Sometimes it's just a big wall of HTML text (with PHP able to come in at any time), like my About page. Anyway, when you edit PHP, you just save, and now every new request will use the new code. A Lisp system will need to be a bit more sophisticated -- I'll have to actually connect to the running Lisp REPL and reload the edited file. I don't expect this to be a problem in practice, but it's slightly more cumbersome. I've read that some Lispers "launch" their Lisp server in a tmux or GNU screen session: they load the Lisp program normally, with local Swank support so that Emacs/Vim/anything with SLIME can connect, and then type (launch) or whatever.

If I wanted to get away from the idea of editing in production, and perhaps force a Lisp program restart when deploying new code (this is the standard model for most other non-CGI language servers, from Python to Node to Java to...), then I could actually just ship a single binary file and execute it once, and the server program would be deployed. Though if I went this route I'd also be tempted to put all assets (even images) into the database, and end up with zero reliance on the filesystem... but of course, even with file dependencies, that's what deployment pipelines are for, so that it's not just an executable, but an executable plus a set of files in the right spots.

So at last I'm faced with the question: should I do the rewrite while ignoring all the existing code (even the SQL statements?) and just rely on this self-documentation of what the schema looks like and what actions I need to support? Or should I let myself reference the code, at the risk of copying the newb structures?


Posted on 2020-02-24 by Jach

Tags: lisp, php, programming

Permalink: https://www.thejach.com/view/id/370

Trackback URL: https://www.thejach.com/view/2020/2/blog_rewrite_notes_-_supported_actions
