
In Part 1 we introduced XML-RPC and the wp.newPost method that allows a new post to be created in a WordPress blog. We created enough framework to show how to move a post to WordPress, but did not create a complete program.

In this part, we will flesh that out to be a simple command line utility that can read a file containing Markdown optionally starting with a simple YAML block and make a new post from it.

Implementation

Better module path update

In Part 1, we started with Lua for Windows and added Lua XML-RPC. The latter module takes the form of a folder, and should be installed alongside the script we are developing. For expediency, the previous post described a one-line tweak to the package.path variable that works well enough for just playing around. But for a script we might actually deploy (even if only into a personal folder on our PATH) it would be better to set things up right. Lua for Windows happens to include the Penlight library, a catch-all of useful functions and tools. For now, we are most interested in the function app.require_here() found in the pl.app module. Calling this function from a script executing in the stock Lua interpreter causes it to edit package.path to include references to the folder containing the script file itself.

Since we don’t actually need any of the rest of the pl.app module, we will simply take advantage of the fact that require "pl.app" returns the module table which we can immediately index to call require_here().

With this change, the top of our file becomes:

require "pl.app" .require_here()
require "xmlrpc"
require "xmlrpc.http"
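
For readers without Penlight at hand, the effect can be approximated in plain Lua. The sketch below is only an illustration of the idea, not Penlight's actual implementation; the helper name path_with_script_dir is hypothetical, and it assumes the script was started by the stock interpreter so that arg[0] holds the script path.

```lua
-- Hypothetical helper: build a package.path that searches the folder
-- holding `script` before the entries already present in `path`.
function path_with_script_dir(script, path)
    -- Everything up to and including the last slash or backslash.
    local dir = script:match("^(.*[/\\])") or "./"
    return dir .. "?.lua;" .. dir .. "?/init.lua;" .. path
end

-- arg[0] is the script name when run by the stock interpreter.
if arg and arg[0] then
    package.path = path_with_script_dir(arg[0], package.path)
end
```

The ?/init.lua entry is what lets a module shipped as a folder be found next to the script.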

Quoting and posting as before

The rest of the code from Part 1 follows unchanged:

---
-- Replace certain significant characters from a string with their 
-- corresponding HTML entity. Only the mandatory characters are 
-- replaced.
function htmlentities(s)
    local e = s:gsub("[%&%<%>]",{
          ["&"]="&amp;",
          ["<"]="&lt;",
          [">"]="&gt;",
        })
    return e
end
--- 
-- Call the wp.newPost XML-RPC method on the named URL, with 
-- title, body, username, and password as specified.
--
function wp_newPost(title, body, rpcurl, username, password)

    title = htmlentities(title)

    local content = {
        post_type = 'post',       -- post, page, link...
        post_status = 'draft',    -- draft, publish, pending, future, private...
        --post_date = '2014-04-01 03:14:16'   -- post date, required if status is 'future'
        post_title = title,     -- Title
        post_content = body,    -- full content, including <!--more--> if desired
    }

    local ok, res = xmlrpc.http.call(
        rpcurl,         -- something like "https://blog.example.com/xmlrpc.php"
        'wp.newPost', 
        0,              -- BlogID, ignored by WP
        username, 
        password,       -- clear-text, so do be sure to use https, not http
        content         -- Post details
    )
    return ok, res
end

Making it useful

Read and post a file

To read the file into a string, we use another handy Penlight function, pl.utils.readfile(). It is a simple wrapper around io.open() and file:read() that returns the entire content of a named file in a string. Since Markdown is text-like, we’ll let it open the file in text mode, but binary mode is an option.

For simplicity, our draft post will use the file name as its title. If YAML were supported, the title could come from there. Or if more elaborate command line argument parsing were supported, an option could be implemented easily to specify the title. For now, however, the filename will do since the post will also be a draft.

We defer the question of storage and handling of the username and password, as usual, for later.

The rpcurl parameter is the URL of the XML-RPC control endpoint for the blog.

Warning: Since the password is sent in the clear, rpcurl really should be https: and not http:. This function does not enforce this.
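
If enforcement is wanted, a check along these lines could run before any call that sends the password. This is a minimal sketch; the names is_https and require_https are hypothetical and not part of the script above.

```lua
-- Hypothetical helper: true only for URLs that start with https://
function is_https(url)
    return type(url) == "string" and url:lower():match("^https://") ~= nil
end

-- Hypothetical guard: returns the URL unchanged when it is https,
-- otherwise nil plus an error message (the usual Lua convention).
function require_https(url)
    if is_https(url) then
        return url
    end
    return nil, "refusing to send a password to a non-https URL: " .. tostring(url)
end
```

A caller might write local url, err = require_https(rpcurl) and bail out with the message when url is nil.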

From the documentation:

If your WordPress root is http://example.com/wordpress/, then you have:

  • Server: http://example.com/ (some tools need just the ‘example.com’ hostname part)
  • Path: /wordpress/xmlrpc.php
  • complete URL (just in case): http://example.com/wordpress/xmlrpc.php

local utils = require "pl.utils" -- could go at top of script

---
-- Make a new post from the body of a file, using the filename as
-- the title. The file is posted as-is, with no substitutions or 
-- other changes. 
function postFile(filename, rpcurl, username, password)
    local title = filename
    local content = utils.readfile(filename, false)
    if not content then return nil, "No such file" end
    local ok, res = wp_newPost(
        title,
        content,
        rpcurl,
        username,
        password)
    return ok, res
end

Simplest main function

Tying it all together into something that can actually run, we do the least amount of command line argument parsing and validation, then simply use the first argument as the file name, the second as the username, and the third as the password.

Rather than have a proper configuration file, the actual XML-RPC endpoint URL is coded here as https://blog.example.com/xmlrpc.php which must be changed to point to your actual WordPress blog where you know a username and a matching password.

assert(arg[1] and arg[2] and arg[3], "Usage: wppost file.md user pwd")
local ok, res = postFile(arg[1], 
    "https://blog.example.com/xmlrpc.php", 
    arg[2], arg[3])
if ok then
    print("wppost success, new post is id " .. res)
else
    print("wppost failed:", res)
    os.exit(1)
end

When the script is run it prints a message containing the ID of the new post if all is ok. If it fails, it prints an error message which may be more or less useful to humans depending on which component saw an error.

Features to consider

At this point, the script is functional. It uploads a file as a draft post, with a post title that can be used to find the post in the All Posts table. But it could do so much more…

Options, help, better documentation

A real tool should have command line options, and should support at least a minimal amount of user help text describing how to actually use those options and the file. The pl.lapp module that is part of Penlight provides a self-documenting argument parser that makes it easy to implement a rich collection of command line arguments similar to what is available with most Linux utilities.
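
To give a feel for the shape such options might take, here is a small hand-rolled parser in plain Lua. It is a stand-in written for illustration only and does not use pl.lapp, which would replace all of it with a declarative usage string. The option names (-u, -t, -h) and the function name parse_args are assumptions, not something the script above defines.

```lua
USAGE = [[
Usage: wppost [options] file.md
  -u, --user NAME    user name for the blog
  -t, --title TEXT   post title (defaults to the file name)
  -h, --help         show this text
]]

-- Hypothetical parser: turns an argument list into a table of options.
-- Returns nil plus a message when the arguments make no sense.
function parse_args(argv)
    local opts = {}
    local names = { ["-u"] = "user", ["--user"] = "user",
                    ["-t"] = "title", ["--title"] = "title" }
    local i = 1
    while i <= #argv do
        local a = argv[i]
        if a == "-h" or a == "--help" then
            opts.help = true
        elseif names[a] then
            -- Options in `names` take the next argument as their value.
            i = i + 1
            if argv[i] == nil then
                return nil, "missing value for " .. a
            end
            opts[names[a]] = argv[i]
        elseif a:sub(1, 1) == "-" then
            return nil, "unknown option: " .. a
        elseif not opts.file then
            opts.file = a
        else
            return nil, "unexpected argument: " .. a
        end
        i = i + 1
    end
    if not opts.help and not opts.file then
        return nil, "missing file argument"
    end
    opts.title = opts.title or opts.file
    return opts
end
```

In a main function this would be called as parse_args(arg), printing USAGE when the result is nil or when opts.help is set.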

The code samples provided have used the most rudimentary of LuaDoc comment formatting. That arguably should be fleshed out to full documentation of the internals, and become the basis for a complete reference manual.

Provide a configuration file

For practical use, a configuration file should be used that remembers the posting URL, username, and possibly even the password.

If the configuration saves the password, then it should also take some care to lock it down so that other users of the PC have to know they are breaking a trust by reading it.
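
One possible shape for such a file is a list of key = value lines. The sketch below parses that format in plain Lua; the format itself, the key names, and the function names parse_config and load_config are all assumptions made for illustration. It does nothing about file permissions, which would still need attention if the password is stored.

```lua
-- Hypothetical parser for lines such as:
--   # comment
--   rpcurl = https://blog.example.com/xmlrpc.php
--   username = alice
function parse_config(text)
    local cfg = {}
    for line in text:gmatch("[^\r\n]+") do
        -- Lines starting with '#' are comments and are skipped.
        if not line:match("^%s*#") then
            local key, value = line:match("^%s*([%w_]+)%s*=%s*(.-)%s*$")
            if key then
                cfg[key] = value
            end
        end
    end
    return cfg
end

-- Hypothetical loader: read the named file and parse it.
-- Returns nil plus the error from io.open if it cannot be read.
function load_config(filename)
    local f, err = io.open(filename, "r")
    if not f then
        return nil, err
    end
    local text = f:read("*a")
    f:close()
    return parse_config(text)
end
```

Values are kept as plain strings, and anything that does not look like key = value is silently ignored.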

Handle just a little YAML

Both Pandoc and StackEdit demonstrate that mixing a taste of YAML with Markdown can be handy. Pandoc allows YAML to appear anywhere in the document as long as the --- marker appears at the beginning of a line and as either the first line of the file or after a blank line. StackEdit appears to support only a single YAML block at the beginning of the file, but seems to allow a blank line to precede it.

The YAML block is useful for carrying metadata about a document within the document file. This most obviously includes things like a post title and tags to apply. Pandoc uses it to also carry a citation database, and to provide data to fill out more elaborate document templates such as lists of authors along with contact information.

Handling YAML correctly would involve using a decent YAML parser to locate, strip out, and parse the YAML blocks. There is no YAML parser included in Lua for Windows, but there are several Lua wrappings of the various high quality YAML parser libraries written in C. Selecting and using such a parser is left as an exercise.
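
Short of a real parser, a single leading block of simple key: value lines can be peeled off with Lua patterns. The sketch below is deliberately not YAML: it assumes one block at the very start of the file, delimited by --- lines, and ignores nesting, lists, quoting, and everything else a proper parser handles. The function name split_front_matter is hypothetical.

```lua
-- Hypothetical helper: split a document into a metadata table and a body.
-- Returns an empty table and the unchanged text (apart from line-ending
-- normalization) when no leading block is found.
function split_front_matter(text)
    -- Normalize Windows line endings so the patterns below stay simple.
    text = text:gsub("\r\n", "\n")
    local block, body = text:match("^%-%-%-\n(.-)\n%-%-%-\n(.*)$")
    if not block then
        return {}, text
    end
    local meta = {}
    for line in block:gmatch("[^\n]+") do
        -- Only flat "key: value" lines are understood.
        local key, value = line:match("^([%w_]+):%s*(.-)%s*$")
        if key then
            meta[key] = value
        end
    end
    return meta, body
end
```

With this in place, postFile could take its title from meta.title when present and fall back to the file name otherwise.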

Tags, Categories, etc.

Properly handling features like tags and categories requires understanding and using the WP Taxonomy API to translate names to identifiers. (But there may be a grandfathered-in feature that supports tags written as a simple comma-separated string. That would be easy to support.)

If YAML is available then use fields from it (as is done by StackEdit for tags) to specify these details. Otherwise (or also) use command-line arguments to set tags and categories.
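
Whichever source the tags come from, a comma-separated string has to be broken into individual names at some point. A minimal sketch follows; the name split_tags is hypothetical, and how the resulting list maps onto fields of the wp.newPost content structure is not shown and would need to be checked against the WordPress XML-RPC documentation.

```lua
-- Hypothetical helper: "lua, wordpress ,xml-rpc" -> {"lua", "wordpress", "xml-rpc"}
function split_tags(s)
    local tags = {}
    for item in (s or ""):gmatch("[^,]+") do
        -- Trim surrounding whitespace and drop empty entries.
        local tag = item:match("^%s*(.-)%s*$")
        if tag ~= "" then
            tags[#tags + 1] = tag
        end
    end
    return tags
end
```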

If taxonomy and other advanced features are handled relatively completely, then it might make sense to support posting as other than a draft.

What about pictures

A nice feature to have would be to support automatic uploading of images referenced within the Markdown to the blog’s media library. The obvious candidates for upload would be images referenced from the PC via local file names. Pictures referenced from a pre-existing hosting location should probably be left where they sit and be ignored by this feature.

Along with automating the upload, it should take note of the new image URL, and fix up the document’s image references to point to the final location.

The idea is that when a post inserts the image of a duck by typing ![ducky](ducky.jpg), the tool uploads the file ducky.jpg and then replaces that reference with ![ducky](https://example.files.wordpress.com/2014/04/01-ducky.png) so that the final post has all its images in place. Don’t forget to handle all the supported kinds of URL references in Markdown!
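
The rewriting half of that idea can be sketched with Lua patterns, assuming the upload has already happened and produced a table mapping local file names to their new URLs. The sketch handles only inline image syntax, not reference-style images, and the function names is_local_ref and rewrite_images are hypothetical.

```lua
-- Hypothetical test: a reference is local unless it starts with a URL
-- scheme (two or more characters before the colon, so that a Windows
-- drive letter such as C: still counts as local) or with //.
function is_local_ref(url)
    return not url:match("^%a[%w+.%-]+:") and not url:match("^//")
end

-- Hypothetical rewriter: replace local inline image targets using the
-- `uploaded` table (local name -> final URL). Anything remote, or not
-- present in the table, is left exactly as written.
function rewrite_images(markdown, uploaded)
    return (markdown:gsub("(!%[.-%]%()([^)%s]+)", function(prefix, url)
        local new = is_local_ref(url) and uploaded[url]
        return prefix .. (new or url)
    end))
end
```

The outer parentheses around the gsub call discard its second return value, the substitution count, so callers get just the rewritten text.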

Integrate with make

My personal desire is to have a command-line posting tool that I can integrate with my usual build process. That is, it should be possible to drive it from make. To achieve this, the tool should consume a document (and its mentioned images and related content files) and produce an upload ticket. If run again, it should check the online post against that ticket and the source files, and only repeat the upload if something is “different enough”.
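
One way to picture the ticket is as a small record holding the post ID and a fingerprint of the uploaded text. The sketch below uses an Adler-32 checksum computed in plain Lua as the fingerprint; that choice, the ticket fields, and the function names are all assumptions for illustration. It compares only against the local source text and does not consult the online post, and it treats any change at all as "different enough".

```lua
-- Adler-32 checksum of a string, using only arithmetic so that it
-- behaves the same with or without a bit-operations library.
function fingerprint(text)
    local a, b = 1, 0
    for i = 1, #text do
        a = (a + text:byte(i)) % 65521
        b = (b + a) % 65521
    end
    return b * 65536 + a
end

-- Hypothetical ticket: what was uploaded, and where it went.
function make_ticket(post_id, text)
    return { post_id = post_id, fingerprint = fingerprint(text) }
end

-- True when there is no ticket yet or the text has changed since
-- the ticket was made.
function needs_upload(ticket, text)
    return ticket == nil or ticket.fingerprint ~= fingerprint(text)
end
```

A make rule could then depend on the ticket file, with the tool rewriting the ticket only when needs_upload says an upload actually happened.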

With this, I can work offline with full revision control of my documents, process them with Pandoc to produce high-quality paper output, or publish them as blog posts as a source of external review of work in progress.

Fossil readily stores (and displays) Markdown, and make easily drives Pandoc to
assemble large documents from many files. It would be interesting if it could also publish those
pieces to a blog.