This is an article in a series of unknown length discussing a tool written in Lua to publish posts on a WordPress blog from a PC. The source code is available in a public fossil repository.

  • Part 1 showed how to use XML-RPC and wp.newPost.
  • Part 2 added file reading to make a minimal working utility.
  • Part 3 added amenities and made the utility useful.
  • We digressed to talk about OAuth and using it at WordPress.com.
  • Part 4 switched to cURL, REST, and OAuth, but can’t yet post.
  • Part 5 improved the token handling and user display and still can’t post.
  • Part 6 implemented posting and added the ability to set title, tags, and categories from command line options.
  • Part 7 added a new utility and created modules common to all our utilities.
  • We digressed to build a tiny embedded web server to make authentication easier.

In this installment we will use the Spoon web server to collect the OAuth token credentials directly rather than depending on the user to copy and paste information manually from their web browser to our command line. The result will be that getting authorized will need only a single typed command.

Why?

OAuth is designed to use a brief web interaction where the user signs in to their data provider and authorizes the application’s use of their private data. For best and clearest security, the sign in and authorization steps should be carried out using the system’s default web browser. The final step of that process has the user’s browser fetch a page from an URL that is specified as part of the application’s registration, with data in the URL that specify the token and related details.

As a simple command line utility, however, the WPCLI tools don’t, can’t, and won’t contain a web browser. But they could contain a tiny web server now that we have Spoon available.

To use this, the application registered with WordPress.com must be changed so that its registered Redirect URL points to our internal web server, and the server needs to both display a document to the user and capture the token and supporting details.

The scenario would go something like this:

  1. User asks a WPCLI tool to authenticate with their blog: wppost.lua --authenticate
  2. The tool launches the user’s default web browser on the WP authentication page.
  3. The tool starts its internal web server.
  4. User agrees this is what she wants, fills out the form and submits the authorization.
  5. Browser fetches http://localhost:8080/token/#...
  6. JavaScript in the fetched page rewrites the fragment into a query.
  7. Browser fetches http://localhost:8080/token/?...
  8. Spoon captures the token, expiration, and site codes from the fetched URL and returns a reassuring document for the browser to display.
  9. The tool stores the token in its configuration file for later.

One unexpected hiccup is represented by steps 5 to 7, which in earlier discussions were thought of as a single step. For ease of use by browser-based applications, WP returns the token as an URL fragment rather than a query. This is easy for in-browser javascript to handle, but by design this part of the URL is never provided to the web server when a document is fetched.

Since we need the information in our server and not stuck in the browser, we need to have the browser rewrite its URL and fetch the document a second time. This is handled by a small fragment of javascript in the page we provide for the /token document.

<script>
if(window.location.hash) {
  window.location.assign(window.location.href.replace("#", "?"))
}
</script>

This uses the window.location.hash property of the current window, which is broadly available (at least post IE8, and in most versions of FF, Opera, Safari, and other major browsers). If set, it contains the complete anchor text of the window, including the hash character. The script lives in the document header, and simply causes the browser to fetch a new document where the # has been replaced by ?. There may be another way to get the effect, but this seemed simple and direct, and it worked.

Note that there is one minor complication: the port number on which we run a server is formally part of the URL known to WP as part of the application registration. But if the port number is already in use on the user’s PC, we can’t use it for authentication. Detecting this and generating a suitable error is left as an exercise for later.
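One way to approach that exercise, as a minimal sketch: try to bind the port before launching the browser, and report a readable error if the bind fails. The helper name portAvailable is hypothetical and not part of the WP CLI Tools. It assumes a bind function that behaves like LuaSocket's socket.bind, which returns nil and an error message when the address cannot be bound. The bind function is passed in as a parameter so the logic can be exercised without touching the network.

```lua
-- Hypothetical helper, not part of the WP CLI Tools.
-- bind is expected to behave like LuaSocket's socket.bind(host, port):
-- it returns a server object on success, or nil plus an error message.
local function portAvailable(bind, host, port)
  local server, err = bind(host, port)
  if not server then
    return false,
      ("cannot listen on %s:%d (%s)"):format(host, port, tostring(err))
  end
  server:close()  -- release the port so the real server can claim it
  return true
end

-- Usage, assuming LuaSocket is installed:
--   local socket = require "socket"
--   local ok, msg = portAvailable(socket.bind, "localhost", 8080)
--   if not ok then io.stderr:write(msg, "\n") os.exit(1) end
```

There is a small window between this check and the real server start in which another program could grab the port, so the check improves the error message but is not a guarantee.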

Implementation

The previous discussion of Spoon describes its implementation and use. Since that article, some minor changes have been introduced. Most notably, Spoon has been improved to capture the entire HTTP request including all headers. The request is now also fully parsed into tables that contain the various document path components and query parameters. The parsed versions have also been processed to remove the various encoding schemes that are layered on top of the URL. Spoon also reflects the escape and unescape functions from the socket.url module into its module table where they can be seen without further reference to either socket or socket.url.
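To illustrate the kind of parsing described here, the following is a minimal sketch that splits a query string into a table and removes percent-encoding. It is not Spoon's actual code: the names unescape and parseQuery are hypothetical, and the real module relies on socket.url for unescaping. The sample token and site values are made up.

```lua
-- Hypothetical stand-in for socket.url.unescape: decode %XX sequences.
local function unescape(s)
  return (s:gsub("%%(%x%x)", function(hex)
    return string.char(tonumber(hex, 16))
  end))
end

-- Split "a=1&b=two%20words" into { a = "1", b = "two words" }.
-- A key with no "=" is stored with an empty string as its value.
local function parseQuery(query)
  local t = {}
  for pair in query:gmatch("[^&]+") do
    local key, value = pair:match("^([^=]*)=?(.*)$")
    t[unescape(key)] = unescape(value)
  end
  return t
end

local q = parseQuery("access_token=abc%21def&expires_in=1209600&site_id=12345")
print(q.access_token, q.expires_in, q.site_id)
```

This sketch does not treat + as a space, and a repeated key simply overwrites the earlier value. Either could matter for a general-purpose server, but neither matters for the token URL handled here.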

The implementation of the full request packet reader closely follows the guidelines described in chapter 3 of RFC7230. See the code in readrequest() found in spoon.lua for the gory details. In outline, it reads the request line and validates it, then reads the headers, then reads a body document if one is present and a transfer-encoding is not specified. Along the way, it is willing to error out and generate status codes 400, 500, or 501 as suggested by the RFC.
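The first step of that outline, reading and validating the request line, can be sketched as follows. This is an illustration only and not the code in spoon.lua: the function name parseRequestLine and the set of supported methods are assumptions, and the line is assumed to have had its trailing CRLF removed already.

```lua
-- Hypothetical sketch of request-line validation; not Spoon's actual code.
-- Methods this toy server is willing to handle (an assumption).
local supported = { GET = true, HEAD = true }

-- Returns method, target, version on success, or nil plus an HTTP status.
local function parseRequestLine(line)
  local method, target, version =
    line:match("^(%u+) (%S+) HTTP/(%d%.%d)$")
  if not method then
    return nil, 400   -- malformed request line
  end
  if not supported[method] then
    return nil, 501   -- well-formed, but the method is not implemented
  end
  return method, target, version
end

print(parseRequestLine("GET /token/?a=b HTTP/1.1"))
```

The pattern is stricter than the RFC grammar, which allows method tokens that are not all uppercase letters. For a server that only expects a browser on the same machine, that simplification seems reasonable.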

The options table provided to the spoon() function has some minor changes. The most visible is that the callback functions for each request method now get just two parameters: a reference to the options table and a table containing everything known about the request including the parsed path and query.

Since we need to handle at least two requests, we will use the loop field of the options table. It will be set to true when we start the server, and the server keeps handling additional requests as long as it remains true. This makes it easy for a document handler to stop the server after delivering the current document, a feature which will be handy for the command line tools.
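The effect of the loop field can be shown with a small simulation that needs no sockets. This is a sketch of the idea rather than Spoon's implementation: the serve function, the nextRequest callback, and the request tables with method and path fields are all hypothetical stand-ins.

```lua
-- Hypothetical simulation of a request loop; not Spoon's actual code.
-- nextRequest stands in for accepting a connection and reading a request.
local function serve(opt, nextRequest)
  local served = 0
  repeat
    local request = nextRequest()
    if not request then break end
    local handler = opt[request.method]
    if handler then handler(opt, request) end
    served = served + 1
  until not opt.loop   -- a handler may clear opt.loop to stop the server
  return served
end

-- Three queued requests; the handler stops the loop on the second one.
local queue = {
  { method = "GET", path = "/token/" },
  { method = "GET", path = "/token/?access_token=abc" },
  { method = "GET", path = "/never-reached" },
}
local i = 0
local opt = {
  loop = true,
  GET = function(opt, request)
    -- plain find: stop once a request arrives carrying a query
    if request.path:find("?", 1, true) then opt.loop = false end
  end,
}
local count = serve(opt, function() i = i + 1; return queue[i] end)
print(count)  -- 2
```

Because the flag is checked only after the handler returns, the response to the current request is always delivered before the server stops.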

The web server is added to the general utilities by planning for the --authenticate option. We add the function handleAuth() to cli\common.lua:

function M.handleAuth(args)
  if not args.authenticate then return false end
  local authURL = [[https://public-api.wordpress.com/oauth2/authorize]] 
  .. "?client_id=36745"
  .. [[&redirect_uri=http%3A%2F%2Flocalhost%3A8080/token/&response_type=token]]
  M.launchBrowser(authURL)

The client_id is that of the WP CLI Tools as registered by me. If you are adapting this code to tools of your own, you should register your own application with WP and edit that value as needed. The launchBrowser() function is not shown, but is expected to ask the OS (only Windows at the moment) to launch the browser asynchronously. In Windows this is done by executing a command that looks like start "" "https://.../?..." The idea is to get the user’s default browser to open the authentication URL at WordPress.com.
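Since launchBrowser() is not shown, here is a minimal sketch of what a Windows-only version might look like. It is a guess at the shape and not the code in the repository: browserCommand is a hypothetical helper, and the sketch assumes the URL contains no double quote characters.

```lua
-- Hypothetical sketch; the real launchBrowser() in the tools is not shown.
-- Build the Windows command line that opens a URL in the default browser.
local function browserCommand(url)
  -- The empty "" is the window title that start expects before a quoted
  -- argument; the URL is quoted so that & is not treated as a separator.
  return ('start "" "%s"'):format(url)
end

local function launchBrowser(url)
  -- os.execute hands the command to the system shell (cmd.exe on Windows).
  return os.execute(browserCommand(url))
end

print(browserCommand("https://example.com/?a=1&b=2"))
```

Separating the command construction from the call to os.execute keeps the string handling easy to check on any platform, even though launching only works on Windows.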

Note: I have changed the registered redirect_uri for the WP CLI Tools to http://localhost:8080/token/. This means that older versions of the tools will continue to work, but the old instructions for manually retrieving a token likely will not unless you have some sort of web server listening on port 8080 locally.

Continuing in M.handleAuth(), we call spoon.spoon to run the web server, configured with an options table containing an anonymous function as the GET handler:

  spoon.spoon{
    port=8080,    -- Use localhost:8080
    loop = true,  -- Respond to multiple requests
    verbose=true, -- Chatter on the terminal

    -- handle GET
    GET = function (opt, request)
  local tokendoc = [===[
<html>
<head>
<title>WP CLI Authorization</title>
<script>
if(window.location.hash) { window.location.assign(window.location.href.replace("#","?")) }
</script>
</head><body>
%s
</body>
</html>
]===]
  local u = request.puri
  local p = u.ppath 
  if p[1] and p[1]:match"^favicon" then return spoon.Errorpage(404, "no favicons") end
  if p[1] == "robots.txt" then return spoon.Errorpage(404, "no robots") end
  if p[1] == "token" then 
    if request.puri.query then
      -- actually process the token info from the query and store it in the 
      -- global args
      local t = request.puri.pquery
      args.token = spoon.escape(t.access_token)
      args.expires = tostring(os.time() + (tonumber(t.expires_in) or 0))
      args.site = t.site_id
      args.writeconfig = true

      local s = ([[
        <h1>WordPress Token</h1>
        <p>You have authorized the WP CLI Tools to access site ID %s 
        for the next %3.1f days with this bearer token:</p>
        <pre>%s</pre>
        <p>You can close your browser now.</p>
        ]]):format(args.site, t.expires_in/24/3600, t.access_token)

        -- tell the document fetch loop that it should stop waiting
        -- for further requests.
        opt.loop = false
      return spoon.response(200, tokendoc:format(s))
    else
      -- Provide the doc with the URL rewrite script
      return spoon.response(200, tokendoc:format[[
        <h1>WordPress Token</h1>
        <p>If you can read this, something went wrong.</p>
        <p>The WordPress access token should have been in the URL following
        a hash mark. If it is there and you can see this then you likely have 
        disabled javascript or have an out of date browser.</p> 
        <pre>request = ]]..pretty.write(request)..[[</pre>
        ]])
    end
  else
    return spoon.Errorpage(404)
  end
end

  }
  return true
end

When the closure opt.GET(opt, request) is called, it decides what to do based on the request. It returns 404 for anything other than /token/, and decides which of two documents to return for that path based on the presence of query parameters.

When no query parameters are found, it builds a document that the end user should never see, since the JavaScript in its <head> is expected to rewrite window.location.href, changing the URI fragment (found in window.location.hash) into a query. For debugging, the unseen document will contain a dump of the request.

When query parameters are found, it uses them to update the args table, which holds the command line options passed in to the script. The table receives values for --token, --expires, and --site, and is marked as if the --writeconfig option had also been specified.
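The handler above trusts that access_token is present whenever there is a query. A slightly more defensive version of the same bookkeeping might look like the sketch below. The helper tokenFields is hypothetical and not part of the WP CLI Tools. It takes the current time as a parameter so the arithmetic can be checked, and it skips the spoon.escape step that the real handler applies to the token.

```lua
-- Hypothetical helper, not part of the WP CLI Tools.
-- query is a parsed query table such as request.puri.pquery;
-- now is the current time in seconds (defaults to os.time()).
local function tokenFields(query, now)
  if not query.access_token or query.access_token == "" then
    return nil, "no access_token in query"
  end
  now = now or os.time()
  local lifetime = tonumber(query.expires_in) or 0
  return {
    token   = query.access_token,
    expires = tostring(now + lifetime),  -- absolute expiry, as a string
    site    = query.site_id,
    days    = lifetime / 24 / 3600,      -- lifetime in days, for display
  }
end

-- Made-up sample values, with a fixed clock for repeatability.
local f = tokenFields(
  { access_token = "abc", expires_in = "1209600", site_id = "42" }, 1000)
print(f.token, f.expires, f.site, f.days)
```

Returning nil and a message when the token is missing would let the GET handler show an error page instead of failing inside the string formatting.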

The final detail is to add the --authenticate option to the utilities, and to call on handleAuth() to deal with it. In wpget.lua the usage text now reads:

-- Put the pl.lapp based options handling near the top for easy visibility
local args = lapp [[
Retrieve a WordPress blog post to a file or stdout. Part of the WP CLI Tools.
https://curiouser.cheshireeng.com/applications/wp-cli-tools/

These options are related to the config file, with the ones marked * actually
stored in the file. Either --blog or --site and --token must be available and 
consistent for posting to be allowed. 
  --authenticate              Do web-based authentication and write config
  --blog (default "")         *The blog at which to post.
  --token (default "")        *The OAuth token from the redirect URL.
  --expires (default "")      *The OAuth token expiration date.  
  --site (default "")         *The WP Site ID for the token's blog.  
  --tokenurl (default "")     The full URL containing the token
  --showconfig                Just display the config file
  --writeconfig               Write the config file with the options  

General options:  
  -v,--verbose                Be more chatty about the process
  --keepraw (default "")      Name a file to fill with raw logging  
  --debug                     Don't use this.

Options for the blog post itself:
  --ID (number)               The post ID to get.  
  --out (default stdout)      The file to write.
]]

-- erase some optional fields from the args table completely
common.clearoptional(args, {'title','tags','category','keepraw'} )

-- handle web-based authentication
common.handleAuth(args)

Similar changes are made to wpput.lua.

With these changes, a command like wpput.lua --authenticate launches a web browser and expects you to sign in and agree, then returns the finished token to the waiting web server where it is safely stored away in your WP CLI configuration file.

What Next?

The next big step is to think about integrating a fossil repository full of posts written in Markdown with a blog, using the WP CLI tools as the communications channel. Doing this successfully will likely drive changes in the tools so that PC side scripting can avoid reposting documents that have not been edited on the PC. Allowing for edits made in the WP web UI would be a bonus.

Another big step to consider is some amount of handling for media other than the post itself. WP supports pictures, audio, and video. A WP post can also have a “featured image” associated. Being able to handle posts with media and featured images in a useful way could also have value.

Repository and Checkins

All of the code supporting this tool is in a public fossil repository. See the discussion at the tools page for how to get started using this repository. This post documents work that was checked in as [8d3e666f64].


(Written with StackEdit.)