22 Jun 2014, 19:35

A simple Node.js script to upload your Harp.js blog images to S3

Whilst there are tons of tools for uploading to S3, I wanted something tuned to how I blog, particularly for a static blog that used to be a WordPress one with year/month directories for images.

Usage:

node upload_s3_images.js image1.jpg image2.jpg image3.png etc

It uses your AWS credentials in ~/aws_config.json: { "accessKeyId": "akid", "secretAccessKey": "secret", "region": "us-west-2" }

It uses the bucket name, root upload directory and S3 base URL from ./s3_blog_config.json: {"bucket": "conoroneill.net", "rootUploadDir": "wp-content/uploads/", "blogS3Url": "https://s3-eu-west-1.amazonaws.com/"}

So if I have a file called s3_upload.jpg on my Desktop and I invoke as follows:

node upload_s3_images.js ~/Desktop/s3_upload.jpg

I end up with a file at: https://s3-eu-west-1.amazonaws.com/conoroneill.net/wp-content/uploads/2014/06/s3_upload.jpg

// upload_s3_images.js
// Upload a space separated list of images to an S3 bucket for use by your Harp.js static blog
// node upload_s3_images.js image1.jpg image2.jpg image3.png etc
// Uses your AWS credentials in ~/aws_config.json { "accessKeyId": "akid", "secretAccessKey": "secret", "region": "us-west-2" }
// Uses bucket name, root upload directory and S3 base URL from ./s3_blog_config.json {"bucket": "conoroneill.net", "rootUploadDir": "wp-content/uploads/", "blogS3Url": "https://s3-eu-west-1.amazonaws.com/"}
// My blog is exported from WordPress so it uses /wp-content/uploads/year/month/ as the directory structure

// Copyright (C) 2014 Conor O'Neill
// Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
// The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

var AWS = require('aws-sdk');
var fs = require('fs');
var path = require('path');
var userHome = process.env.HOME || process.env.HOMEPATH || process.env.USERPROFILE;
AWS.config.loadFromPath(userHome + '/aws_config.json');
var blogConfig = require('./s3_blog_config.json');


var fileList = process.argv.slice(2);

fileList.forEach(function (val, index, array) {

  var fileName = val;
  var currentTime = new Date();
  var currentYear = currentTime.getFullYear();
  var currentMonth = ("0" + (currentTime.getMonth() + 1)).slice(-2);

  var uploadName = blogConfig.rootUploadDir + currentYear + '/' + currentMonth + '/' + path.basename(fileName);
  var fileBuffer = fs.readFileSync(fileName);
  var metaData = getContentTypeByFile(fileName);
    
  var s3 = new AWS.S3(); 

  var params = {Bucket: blogConfig.bucket, Key: uploadName, Body: fileBuffer, ContentType: metaData};

  s3.putObject(params, function(err, data) {
    if (err) {
      console.log(err);
    } else {
      console.log("Successfully uploaded: " + blogConfig.blogS3Url + blogConfig.bucket + "/" + uploadName);
    }
  });

});

function getContentTypeByFile(fileName) {
  var rc = 'application/octet-stream';
  var fn = fileName.toLowerCase();

  if (fn.indexOf('.html') >= 0) rc = 'text/html';
  else if (fn.indexOf('.css') >= 0) rc = 'text/css';
  else if (fn.indexOf('.json') >= 0) rc = 'application/json';
  else if (fn.indexOf('.js') >= 0) rc = 'application/x-javascript';
  else if (fn.indexOf('.png') >= 0) rc = 'image/png';
  else if (fn.indexOf('.jpg') >= 0 || fn.indexOf('.jpeg') >= 0) rc = 'image/jpeg';

  return rc;
}
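
One assumption worth calling out: the script never sets an ACL, so it relies on the bucket itself being configured for public access (mine is). If yours isn't, a small tweak, assuming your bucket allows object ACLs, is to ask for public-read on each upload:

  // Variant of the params object with an explicit ACL.
  // Only needed if your bucket policy doesn't already make objects public.
  var params = {Bucket: blogConfig.bucket, Key: uploadName, Body: fileBuffer, ContentType: metaData, ACL: 'public-read'};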

Source is here on GitHub.

I hope you find it useful.

13 Jun 2014, 18:11

Thanks to Elon Musk's position on patents, I now have a Tesla S

Well… almost

Tesla S

Printrbot Simple

11 Jun 2014, 08:48

How Twitter can make one beeeeelllllllion dollars in the next few weeks

After 7 long years, Twitter finally added the ability to mute people recently. Meanwhile those of us who never use the official apps and use Janetter etc instead have had powerful keyword filtering for years. I see Twitter has some silly “which team do you support in the World Cup” nonsense now. But they really aren’t thinking this through. There are many many many many people on Twitter who would happily never see a World Cup tweet. Why not make that an opt-in, but instead of just hiding those tweets, replace each one with an ad for something non-sport-related (in third party clients too)? Finally a sustainable business model for Twitter. For the summer at least :-) You’re welcome Twitter.

02 Jun 2014, 12:56

PirateBox could be an awesome file sharing tool at every event, particularly where the internet is poor, i.e. all of them.

One thing I’ve run into many times is the lack of internet bandwidth at events. It doesn’t seem to matter whether it is a big pay-for thing or a BarCamp - once everyone was online, things slowed to a crawl. This was a really big problem at Dojo Camp last year where we had to get a ton of kids set up with the Arduino IDE and other tools. In the end it was faster to pass around a USB key but we lost a lot of valuable time doing this.

PirateBox has been around for ages as an anonymous drop box (not DropBox!). Basically it’s software and configuration using dirt cheap TP-Link routers to provide a wifi access point with connected storage and files. Version 1.0 was just released which is a ground-up rebuild.

So at events you could tell everyone to connect to one or more of these PirateBoxes, where you and the various presenters/talkers have pre-uploaded all of the relevant material and assets, so they can be quickly accessed and downloaded.

In addition to file sharing, it also has chatting, message boarding, and media streaming. I’d love to check out how many people it can support simultaneously.

As a $35 self-build exercise, it seems like a no-brainer.

01 Jun 2014, 15:48

Creating a valid ops file in JSON format for Minecraft 1.7.9

Our 8yo wanted a local server on his laptop today so I grabbed the latest 1.7.9 server exe from the Minecraft site and ran it in its own directory. It created the usual files and gave a GUI-style interface with no obvious console like the raw Java version. Fionn then asked me to make him an Op so he could change things. And it was all downhill from there.

The standard ops.txt was missing but there was an ops.json containing just two square brackets. No amount of messing with creating an ops.txt or putting his username inside the brackets of the json file worked. Over an hour of googling finally got me close. In recent versions of Minecraft, they have moved to using JSON instead of .txt for config files and they are rolling out the idea of UUIDs so that you can easily change your username. Both are reasonable changes but implemented terribly in this case. Unless the format of your json file is 100% correct, it doesn’t throw an error, it just deletes the file!

So after more than 10 attempts, this is what you need:

  • Find your UUID. Note that several sites claim to generate this for you but they are missing the dashes in the UUID. The one that worked for me was this one. Enter your username and get something like 54d61e19-71cc-477d-8215-8a11c41f5211 back. If your generator leaves out the dashes, see the snippet after this list.

  • Edit ops.json in a text editor and replace the contents as follows:

[
  {
    "uuid": "54d61e19-71cc-477d-8215-8a11c41f5211",
    "name": "bobloblaw",
    "level": 4
  }
] 
  • If you need more than one Op, do it like so:

[
  {
    "uuid": "54d61e19-71cc-477d-8215-8a11c41f5211",
    "name": "bobloblaw",
    "level": 4
  },
  {
    "uuid": "44d61e69-6166-444d-8665-6a11567f5211",
    "name": "mrsmines",
    "level": 4
  }
]
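
If the UUID you get back is missing the dashes, this little snippet (my own, nothing official) inserts them in the standard 8-4-4-4-12 pattern. Run it with node or in a browser console:

// Insert dashes into a 32-character Minecraft UUID (8-4-4-4-12)
function dashUUID(raw) {
  return raw.replace(
    /^([0-9a-f]{8})([0-9a-f]{4})([0-9a-f]{4})([0-9a-f]{4})([0-9a-f]{12})$/i,
    '$1-$2-$3-$4-$5'
  );
}

console.log(dashUUID('54d61e1971cc477d82158a11c41f5211'));
// 54d61e19-71cc-477d-8215-8a11c41f5211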

That should be all you need. Leave a comment if you have any problems with it.

21 May 2014, 07:08

My presentation from 3D Camp Limerick. Hacking by Google Search - Building hardware with no electronics knowledge

I’ve been meaning to go to 3D Camp since it first started. It has been about a lot more than 3D for many years. I finally did it on Saturday and had an absolute blast. I met people I’ve been tweeting at since 2007 and really got the full BarCamp mojo whilst I was there. James Corbett, the organiser, was kind enough to mention that I helped to organise the first BarCamp in Ireland with Walter and Damien way back in 2006. My only regret is that instead of a BarCamp in every town and village of Ireland every weekend, we now only have 3D Camp. I hope people will realise again soon how important it is to have regular free unconferences with no one hawking their wares, just sharing knowledge, the love of technology and what it can do for people.

OK, rant over, here’s my deck. It’s called Hacking by Google Search - Building hardware with no electronics knowledge. It’s basically a summary of all the messing I’ve been doing since mid-2012 when I got the electronics bug again. Anyone who follows this blog regularly will know most of it. But this one has animated gifs and some extra thoughts. It’s also all HTML, done using a tool called reveal.js. I really enjoyed using this approach and it means it can be hosted on GitHub Pages, just like this blog. Well worth checking out for your own decks. You only need minimal HTML knowledge.

13 Apr 2014, 10:48

My Pebble controlled electric blanket using CloudPebble and Simply.js

Messing with hardware and software can be both a joy and incredibly infuriating. Yesterday I spent a few intermittent hours trying to get Arduino and Espruino communicating reliably over NRF24L01+ wireless transceivers. I only half succeeded. I followed this with some reading up on both CloudPebble and Simply.js. Literally 5 minutes later I had my first Pebble App installed on my watch and was remotely controlling the electric blanket in our bed.

Bandon Blanket

CloudPebble is a web-based IDE for building Pebble Apps. I believe it was started independently but is now run by Pebble themselves. Simply.js is a library/SDK for building Pebble Apps using only JavaScript. Pebble recently announced that CloudPebble now supports Simply.js and boy they weren’t kidding. It’s now child’s play to build Pebble Apps.

This is how quickly I built the App:

  • Create a free Pebble Developer account
  • Login to CloudPebble using your Pebble credentials
  • Create a project and pick Simply.js as the type
  • You automatically get the sample Simply.js App in the IDE
  • Compile that and send it to your Watch. You of course need the Pebble App on your phone with developer options enabled. The Pebble App will also tell you your phone’s IP address which CloudPebble may ask for.
  • I’ve found that I can’t install from the desktop but if I load the IDE URL on my phone I can do it from there. I assume this is because I’m on house wifi and behind a firewall with a private IP.

That first App looks like this:

console.log('Simply.js demo!');

simply.on('singleClick', function(e) {
  console.log(util2.format('single clicked $button!', e));
  simply.subtitle('Pressed ' + e.button + '!');
});

simply.on('longClick', function(e) {
  console.log(util2.format('long clicked $button!', e));
  simply.vibe();
  simply.scrollable(e.button !== 'select');
});

simply.on('accelTap', function(e) {
  console.log(util2.format('tapped accel axis $axis $direction!', e));
  simply.subtitle('Tapped ' + (e.direction > 0 ? '+' : '-') + e.axis + '!');
});

simply.setText({
  title: 'Simply Demo!',
  body: 'This is a demo. Press buttons or tap the watch!',
}, true);

And if you read the Simply.js site, you can access URLs like this:

ajax({ url: 'http://simplyjs.io' }, function(data){
  var headline = data.match(/<h1>(.*?)<\/h1>/)[1];
  simply.title(headline);
});

Merge the two together and ta-daaaaa, the Bandon Bed Button Pebble App.

console.log('Bandon Bed Button!');

simply.on('singleClick', function(e) {
  console.log(util2.format('single clicked $button!', e));
  if (e.button == "up"){
    ajax({ url: 'http://url-of-conors-thing-to-turn-on' }, function(data){
      simply.subtitle('Turned on Blanket!');  
    });    
  } else if (e.button == "down"){
    ajax({ url: 'http://url-of-conors-thing-to-turn-off' }, function(data){
      simply.subtitle('Turned off Blanket!');  
    });        
  }
});

simply.setText({
  title: 'Bandon Bed Button!',
  body: 'Press Up to turn on the blanket. Press Down to turn off the blanket.',
}, true);

Just in case the data route for this is not clear, here’s what it looks like:

From Pebble to Switch

I referred to this on Twitter last night as The Internet of Thangs. I’m already sick of The Internet of Things. The IoT hype cycle is completely out of control before the first killer apps/products have even been created. Big Data has become Big Bullshit. I sincerely hope that the winner in this area hasn’t even incorporated as a business yet. If we leave it to Google, Cisco, GE or any of the other behemoths, we’ll end up with whatever they read in the last Suarez book rather than something that is a fundamental change in how we benefit from technology. The Pebble is never going to end up on everyone’s wrists but by providing simple tools that any idiot like me can play with, it could be the seed that eventually leads to something huge. cf Altair, ZX Spectrum, BBC Micro, ARM, Arduino, Raspberry Pi.

30 Mar 2014, 13:36

Interfacing Node.js to Raspberry Pi hardware like 433MHz transmitters with node-ffi

Now that I can talk to the Efergy remote control switches using Arduino and DigiX, it’s time to do the same thing with Raspberry Pi and Node.js.

But first, check out this gorgeous Pimoroni Pibow Timber case that the RPi now lives in. Makes me smile every time I look at it.

Pibow

Start by plugging in one of the Efergy switches and using its remote control to set its code. Use the sniffer program from the last post to find out what that code is.

Connecting the 433MHz transmitter to the RPi is dead easy. Connect GND to pin 6, VCC to pin 1 and DATA to pin 11. More details on the WiringPi Pinout. For our purposes: BCM GPIO 17 == WiringPi 0 == Pin 11 == GPIO #0

The simplest way of doing this is to create a command-line program in C or C++ and then shell out to it in Node. But where’s the fun in that? The core approach I decided to use was via a Node module called Node FFI which makes it relatively easy to call out to C libraries from Node. The alternative is to build a Node module in C++. But that wasn’t going to work, given that I haven’t written a line of C++ in my life, after the trauma of doing a training course in it in the late 90s and seeing the horror inflicted on my beloved C.

Unfortunately I quickly realised that node-ffi only does C whilst the RCSwitch-Pi library is C++ (but the WiringPi library under that is C again!).

An abortive attempt at using ffi-generate went nowhere as it wouldn’t even run on the RPi due to some libclang problem. Running in an Ubuntu VM worked after a lot of llvm messing but I couldn’t make any sense of the output.

Cue a lonnnngggg session Googling how to interface C to C++ and realising that I hate C++ even more now :-) Eventually I found an idiot’s guide to wrapping C++ so it can be called by C. Bit by bit I figured it all out but hit a brick wall at one point. This forced me to use the GDB debugger for the first time in 15 years. Luckily I spotted the problem using GDB in about a minute flat and got a test C program working which would call the C++ functions correctly.

Then it was back to Node.js and node-ffi for another round of head scratching and non-stop errors. Finally it all clicked and the Efergy module clicked too!

Now that I have the basics working, I will clean up this code and use it in the Node server that will run on the RPi via the usual exports/require approach. I’ll also hopefully be able to apply the same approach to any C++ or C library that accesses any RPi hardware.
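
As a rough sketch of what that exports/require wrapping might look like (jumping ahead to the FFI declarations from send.js below; efergy.js and sendCode are names I'm making up here, not a published module):

// efergy.js - sketch: hide the FFI plumbing behind one exported function
var ffi = require('ffi');

var libwiringPi = ffi.Library('/usr/local/lib/libwiringPi', {
  'wiringPiSetup': ['int', []]
});

var libRCSwitch = ffi.Library('./libwrapper', {
  'newRCSwitch': ['pointer', []],
  'RCSwitch_enableTransmit': ['void', ['pointer', 'int']],
  'RCSwitch_send': ['void', ['pointer', 'int', 'int']]
});

var PIN = 0;

if (libwiringPi.wiringPiSetup() == -1) {
  throw new Error('Error initialising WiringPi');
}

var mySwitch = libRCSwitch.newRCSwitch();
libRCSwitch.RCSwitch_enableTransmit(mySwitch, PIN);

// Send a 24-bit code to the Efergy socket
exports.sendCode = function (unitCode) {
  libRCSwitch.RCSwitch_send(mySwitch, unitCode, 24);
};

Then any server code just does require('./efergy').sendCode(109011).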

The final set of steps are actually quite simple. All the effort was in the learning.

Install WiringPi

mkdir ~/gitwork
cd ~/gitwork
git clone git://git.drogon.net/wiringPi
cd wiringPi
./build

Install RCSwitch-Pi

cd ~/gitwork
git clone https://github.com/r10r/rcswitch-pi.git
cd rcswitch-pi
make

Build an RCSwitch-Pi library

g++ -shared -fpic RCSwitch.cpp -o libRCSwitch.so

Simple C++ test to make sure you can talk to Efergy

cd ~/gitwork/rcswitch-pi

Create a file called efergy.cpp with the following contents

#include "RCSwitch.h"
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[]) {

    /*
     output PIN is hardcoded for testing purposes
     see https://projects.drogon.net/raspberry-pi/wiringpi/pins/
     for pin mapping of the raspberry pi GPIO connector
     */
    int PIN = 0;
    int unitCode = atoi(argv[1]);

    if (wiringPiSetup() == -1) return 1;

    printf("sending unitCode[%i]\n", unitCode);
    RCSwitch mySwitch = RCSwitch();
    mySwitch.enableTransmit(PIN);

    mySwitch.send(unitCode, 24);

    return 0;
}

Now compile and run it:

g++ -c -o efergy.o efergy.cpp
g++ RCSwitch.o efergy.o -o efergy -lwiringPi
sudo ./efergy 109011
sudo ./efergy 109019

If all is working ok, the switch will turn on and off.

Install Node/NPM on Raspberry Pi

    cd ~
    sudo mkdir /opt/node
    wget http://nodejs.org/dist/v0.10.23/node-v0.10.23-linux-arm-pi.tar.gz
    tar xvzf node-v0.10.23-linux-arm-pi.tar.gz
    sudo cp -r node-v0.10.23-linux-arm-pi/* /opt/node
    cd ~
    nano .bash_profile
     
    #Add these lines to the file you opened
    PATH=$PATH:/opt/node/bin
    export PATH

    #Save and exit
     
    #Test
    node -v
    npm -v

Wrapping C++ so it can be called by C

Create wrapper.h and wrapper.c

wrapper.h:

#ifndef __MYWRAPPER_H
#define __MYWRAPPER_H

#ifdef __cplusplus
extern "C" {
#endif

  typedef struct RCSwitch RCSwitch;

  RCSwitch* newRCSwitch();

  void RCSwitch_send(RCSwitch* v, unsigned long Code, unsigned int length);

  void RCSwitch_enableTransmit(RCSwitch* v, int nTransmitterPin);

  void deleteRCSwitch(RCSwitch* v);

#ifdef __cplusplus
}
#endif
#endif

wrapper.c:

#include "RCSwitch.h"
#include "wrapper.h"
#include <stdio.h>

extern "C" {
  RCSwitch* newRCSwitch() {
    //printf("Inside newRCSwitch");
    return new RCSwitch();
  }

  void RCSwitch_send(RCSwitch* v, unsigned long Code, unsigned int length) {
    //printf("Inside RCSwitch_send");
    v->send(Code, length);
  }

  void RCSwitch_enableTransmit(RCSwitch* v, int nTransmitterPin) {
    //printf("Inside RCSwitch_enableTransmit");
    v->enableTransmit(nTransmitterPin);
  }

  void deleteRCSwitch(RCSwitch* v) {
    delete v;
  }
}

Compile everything

Use the -g flag everywhere if you want to debug with GDB (gdb efergy2, break main, run 109011, next, next). Simple GDB Tutorial here and some tips on RPi Forums.

g++ -g -c wrapper.c -o wrapper.o
gcc -g -c efergy.c -o efergy.o
g++ -shared -fpic -g RCSwitch.cpp -L ./ -l wiringPi -o libRCSwitch.so
g++ -g efergy.o wrapper.o ./libRCSwitch.so -o efergy2

Run the C program

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./
sudo ./efergy2 109011

Now to get everything running in Node/JavaScript

Install Node FFI

sudo npm install -g ffi

Create send.js

var ffi = require("ffi")

var libwiringPi = ffi.Library('/usr/local/lib/libwiringPi', {
    'wiringPiSetup' : [ 'int', [] ],
    'digitalWrite' : [ 'void', ['int', 'int'] ],
    'pinMode': [ 'void', ['int', 'int'] ],
    'delayMicroseconds' :  [ 'void', ['int'] ]
})

var libRCSwitch = ffi.Library('./libwrapper', {
  'newRCSwitch' : ['pointer', [] ],
  'RCSwitch_send': ['void', [ 'pointer', 'int', 'int' ] ],
  'RCSwitch_enableTransmit': ['void', ['pointer', 'int'] ]
})

if (process.argv.length < 3) {
  console.log('Arguments: Efergy Switch Code')
  process.exit()
}

var PIN = 0;
var unitCode = parseInt(process.argv[2]);
var mySwitch = libRCSwitch.newRCSwitch();

if (libwiringPi.wiringPiSetup() == -1){
    console.log("Error initialising WiringPi");
    process.exit(1);
}

if (mySwitch.isNull()) {
    console.log("Oh no! Couldn't create object!\n");
} else {
    libRCSwitch.RCSwitch_enableTransmit(mySwitch, PIN);
    libRCSwitch.RCSwitch_send(mySwitch, unitCode, 24);
}

Compile the libraries

g++ -shared -fpic -g wrapper.c -L ./ -l RCSwitch -o libwrapper.so
g++ -shared -fpic -g RCSwitch.cpp -L ./ -l wiringPi -o libRCSwitch.so

Run send.js

Note that anything which uses WiringPi has to be run with sudo. A bit of a pain. I’m not sure I want to be running the Node process as root.

sudo env PATH=$PATH LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./ node send.js 109011

That’s it. You can now turn on and off your Efergy socket using Node. Next up, a simple Node server to handle remote requests to do this.
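
I haven't built that server yet, but as a minimal sketch of the shape it could take (reusing the hypothetical efergy.js module sketched earlier; the on/off codes are the ones from my remote above):

// server.js - sketch: toggle the Efergy socket over HTTP
var http = require('http');
var efergy = require('./efergy');  // the hypothetical module sketched earlier

var ON_CODE = 109011;
var OFF_CODE = 109019;

http.createServer(function (req, res) {
  if (req.url == '/on') {
    efergy.sendCode(ON_CODE);
    res.end('Switch on\n');
  } else if (req.url == '/off') {
    efergy.sendCode(OFF_CODE);
    res.end('Switch off\n');
  } else {
    res.statusCode = 404;
    res.end('Not found\n');
  }
}).listen(8080);

console.log('Efergy server listening on port 8080');

It would need the same sudo env incantation as send.js to run.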

23 Mar 2014, 15:10

Our web and mobile enabled electric blanket using Electric Ireland's Efergy RC sockets

A major ongoing struggle in our household is convincing one of our many children to go upstairs and put on the electric blanket for us. I honestly don’t know how we get through the winter.

Wouldn’t it be fantastic if we could use technology instead of children to do this?

Last Thursday I came home from work to discover that Electric Ireland had sent us three Efergy remote control power sockets and a plugin energy monitor. I have no idea why they did so, apart perhaps from our crazy electricity use.

Efergy

My mind immediately leaped to our electric blanket conundrum.

A quick look at the label and I was pleased to see they worked at 433MHz. What were the chances that the $1.99 wireless modules I bought from DX in China would work with them? Oh the joy to discover that not only do the modules work with them, but some genius has already built a powerful Arduino library to do so!

It turns out that the Efergy brand uses the same simple protocol as a wide range of others around the world. I guess they are all OEMing something.

A 2-minute setup on my Arduino Uno, with a receiver and the relevant sniffer code loaded, and I had the 11 codes (5 on, 5 off, 1 all-off) that my specific remote control used. They are all 24-bit, protocol “1”, with 6 decimal digits like 109223. I assume they distribute a range of remotes to avoid neighbours clashing. The RC switches themselves are not locked to a specific code; they have to be set by the remote control at power up.

2 more minutes to set up a transmitter to cycle between the codes and I was toggling the RC switches on and off.

I really was gobsmacked how easy it was. It was a lovely reminder of how incredible the internet really is. We’ve become so used to it, we forget that the above would have taken me months (or never!) in pre-internet days instead of 10 minutes total.

Of course that was just step 1. What I really wanted to do was internet-enable these beauties. I’ve been meaning to do something along these lines with the DigiX board since I got it. The DigiX is a really powerful Arduino clone with a 32-bit Atmel ARM CPU, tons upon tons of IO, SD card, audio and, most importantly for this project, Wifi on-board.

frontside

Getting Wifi set up the way I wanted on the DigiX took more time than I’d have liked yesterday but I finally figured out that the Chinese USR-wifi232-g module web UI doesn’t like Chrome! Once I switched to Firefox, all went swimmingly. I connected it to a Wifi AP, gave it a fixed IP address and then set up a port forward from the house router to it. Obviously I use Dynamic DNS with Dyn on the home router so that I don’t have to track changing IP addresses manually.

backside

The DigiX comes with an example Web Server built-in so I was able to just create a few simple URLs and paste the RCSwitch code into the relevant places. So within another few minutes I was toggling the switches over the internet.

switch1State = 1;
mySwitch.send(109323, 24);

Another moment when I forced myself to stop and think about how incredibly easy all this stuff has become. And all of it sitting on the shoulders of giants. So far, all I’ve done is copy and paste stuff together.

But of course I wasn’t happy there. I wanted a mobile app! Luckily, Colum Bennet in FeedHenry has created a superb new Getting Started guide which will be released soon. It walks you through creating an Angular Hybrid mobile app which talks to a FeedHenry Node.js back-end to do a really simple task. Despite only playing a programmer for the purposes of blogging, I was able to easily modify both the client and cloud apps to create what I needed with the Node request module.
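
The cloud piece really is tiny. I'll post the real code once the hard-coding is gone, but the core of it is little more than this sketch, which relays a command to the DigiX using the request module (the URL and function name are placeholders, and the FeedHenry endpoint wiring is omitted):

// Sketch of the cloud-side relay: hit the DigiX's URL when the app asks
var request = require('request');

var DIGIX_ON_URL = 'http://url-of-conors-thing-to-turn-on'; // placeholder

function turnOnBlanket(callback) {
  request({ url: DIGIX_ON_URL }, function (err, response, body) {
    if (err) return callback(err);
    callback(null, { status: 'sent', code: response.statusCode });
  });
}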

Angular

Of course I could send the commands directly from the mobile app to the DigiX, but routing it through the cloud means that eventually this can be multi-user, multi-switch, re-configurable and securely locked-down.

Success

One really nice feature of the FeedHenry platform is that you can generate hosted Web Apps from your mobile Apps. This means that whilst I now have a lovely Android APK on my Galaxy S4, my wife can use the web app on her iPhone to achieve exactly the same thing using a simple bookmarked CNAME.

Web App

I’ll upload everything to a public GitHub repo as soon as I have removed all the hard-coding and added a modicum of “security” with API keys etc. More posts on this technology combo coming in the next few weeks.

With the Bandon Bed Button, we have now achieved Electric Blanket Nirvana.

16 Mar 2014, 18:33

The nitty-gritty of moving from WordPress to HarpJS

As I said in the previous posts, the move from WP to HarpJS was not exactly smooth. But I’m glad I’ve done it as I finally learned how to use a lot of things like Jade that I’ve long-fingered for years.

The first step was to get the original content out of WordPress. This had the added twist that the conoroneill.net blog actually started out as a Posterous blog which I imported into WordPress when I realised that Posterous was a dead-end. So the first two years of posts have some “interesting” problems in them.

The tool I used for import was wpJson4Harp as recommended on the HarpJS site. These were the notes I took on that part of the process:

  • It doesn’t work off a WXR export or WP APIs but actually needs access to your WordPress DB. This is not a goer if you have shared hosting with the MySQL access locked down. I had to dump my WP DB and load it into a VirtualBox Ubuntu VM and run the import tool there.
  • As I mentioned above, the older Posterous posts had problems
    • Lots of embedded YouTube Videos not working
    • I was gutted to discover that the Posterous to WP import had not moved over the images. So now that Posterous is gone, so are those images. A harsh lesson learned. I’m just imagining how exposed the average Tumblr person is to the same problem
    • Posterous seemed to use 4 spaces in formatting quite a bit. So of course Harp thinks this is source code in the generated MD files and formats appropriately. I manually did a search/replace on this to convert all sets of 4 spaces to 1. Seems to have done the trick
  • I’ve been using a YouTube plugin which lets me embed videos by using a fake url with the prefix httpv. None of these work. Probably won’t be that hard to do a search/replace on those eventually.
  • The “blog index” file that wpJson4Harp generated (_data.json) is quite crude. It has JS timestamps rather than human readable ones and the posts are not in chronological order
  • I ended up making some changes to its Python source to simplify its output and make it more “blog like”
  • I downloaded all the files from wp-content in WordPress and uploaded them to my S3 account using Cloudberry Explorer. I then did a search replace across all of the MD files to replace my conoroneill.net URLs with https://s3-eu-west-1.amazonaws.com/

So once I had all of the old blogposts as MD files, it was time to get them into some sort of HarpJS blog. I managed to install Harp on Windows after a lot of messing with Visual Studio Express (both 2010 and 2012 now installed) and node-sass. Actually node-sass just seems to be a general pain in the arse and I hope people start using something else that doesn’t require native compilation.

Unfortunately, even though Harp installed, it doesn’t work correctly on Windows if you are using it for GitHub Pages. It throws an error about bad paths if you do harp compile _harp ./

So I moved over to my Ubuntu VM and setup everything there instead. The instructions for using GitHub Pages as the host on the Harp site work perfectly for Ubuntu.

As Harp is pretty new, there are very few templates/themes for blogs. The general web-site ones seem good and I’ll be trying them out after this. So I ended up using the Baseline template and that’s where the heavy work began. Here’s all the problems/challenges/learnings I went through over the space of several weekends:

  • Template assumes /blog for blog but that’s not what I use, I need it to be in the root. Some simple playing with the templates seemed to take care of that.
  • I didn’t use year/month/day in my original URLs on Posterous (and then WordPress) so I now have 650 md files and html files in one directory each. I’ll move to date based directories from now on, once I figure it out.
  • The default index.html (_layout.jade) just lists every post in the order it finds them in _data.json but that’s completely un-blog-like. I had to implement a bunch of JavaScript in the Jade template to sort the posts into reverse chronological order (see the sketch after this list). As I’m not a JavaScript programmer, this ended up being several days of grabbing bits of code off Stack Overflow and sticking console.log in everywhere until I finally got it working. Learned a lot but it nearly ended up being a show-stopper and I came very close to giving up. You can see the main changes here (don’t laugh!)
  • As a result I realised that Jade is an extremely limited templating language. I can see why a lot of people use EJS instead. I had to drop to raw JavaScript to do most things e.g. converting the JS timestamps to human readable format
  • Stylus is a reasonably nice way of doing CSS but of course you can’t use line numbers in Chrome Inspector to see what line in the Stylus template has to change. But the upside is that I now have a basic understanding of Stylus
  • I learned how to do Google Web Fonts which was neat
  • There is no pagination in the index.html template and I cannot find a “recipe” for this even in a wider Jade context.
  • I had to delete some references to Twitter etc in the templates as they kept causing Jade errors. I think it’s some cross-referencing problem between author in _data.json and the config file for Twitter.
  • The home page should not be 650+ post titles, it should be 5-10 full posts, like a normal blog. I haven’t got started on this yet but it will require cross referencing between _data.json and individual HTML/md files to pull in the content
  • The RSS feed did not validate for several reasons (JS timestamps, titles). I fixed that so it now validates. But like the home page, it only has my manually entered post summaries rather than my preferred approach of full posts.
  • The RSS feed was set to rss.xml but I changed it to /feed so that existing subscribers to my WP blog don’t all lose me in the move
  • The Disqus integration works well but ideally I’d love an Open Source JS commenting system that stores comments in maybe JSON files on Dropbox
  • Google Analytics support and the avatar support for GitHub/Twitter/Facebook/G+ is nice.
  • There is obviously no search built in. Currently I’m using Google Custom Search which works well but is fugly. Is there a possibility to create a simple JS search tool that uses _data.json as a first MVP and later builds a searchable JSON index file as part of the compile process?
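
For the curious, the reverse-chronological sort mentioned above boils down to something like this once you strip away the Jade noise (the date field name is an assumption about your _data.json shape; mine holds JS timestamps):

// Sketch: read the blog index and sort posts newest-first
var data = require('./_harp/_data.json');

var posts = Object.keys(data).map(function (slug) {
  var post = data[slug];
  post.slug = slug;
  return post;
});

// JS timestamps compare numerically, so newest-first is just b - a
posts.sort(function (a, b) {
  return b.date - a.date;
});

posts.forEach(function (post) {
  console.log(new Date(post.date).toDateString() + ' - ' + post.title);
});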

So I now have something that almost works the way I want it. Uploading images to S3 is no real strain and I’ll be able to make all the other changes I want incrementally.

The biggest ongoing annoyance is that “harp compile” deletes all of the generated HTML and re-builds everything from scratch every time. This isn’t scalable when you have 650+ posts and it means that when I add a new blogpost I have to wait 2-3 minutes for it to recompile. There should be an incremental option which assumes no structural/styling changes have happened and just looks for new md files to compile and then rebuilds index.html. Good old “Make” had this nailed a long time ago ;-) Actually, could git not be leveraged for this? Compare the current working tree with the last commit?
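
Sketching that git idea in Node (purely hypothetical; Harp has no such hook today):

// Sketch: ask git which .md files changed or appeared since the last commit,
// so an incremental compile could rebuild only those plus index.html
var exec = require('child_process').exec;

exec('git status --porcelain -- "*.md"', function (err, stdout) {
  if (err) throw err;
  // each line looks like " M old-post.md" or "?? new-post.md"
  var changed = stdout.split('\n').filter(Boolean).map(function (line) {
    return line.slice(3);
  });
  console.log('Posts needing recompilation:', changed);
});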

That HTML deletion is also a problem for using a DNS CNAME with GitHub Pages since it deletes that file from the root directory every time. Update: OK, I just sorted this with two minutes effort. Created CNAME file in _harp and added it with layout: false to _data.json. The compile process then copies it over to root.

I also must look into adding tagging and a ZX Spectrum header image. And it would be great to have a simple cli tool called new-blog-post which asks for a title, date and summary, updates _data.json and creates the empty .md file. I could probably write this myself but I’ve 500 other things I need to do too.
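
If someone wants to beat me to it, here's a rough sketch of how I imagine new-blog-post working (the _data.json shape and paths are from my setup above; it defaults the date to now; hack to taste):

// new-blog-post.js - sketch: prompt for post details, update _data.json
// and create the empty .md file
var fs = require('fs');
var readline = require('readline');

var rl = readline.createInterface({ input: process.stdin, output: process.stdout });

rl.question('Title: ', function (title) {
  rl.question('Summary: ', function (summary) {
    rl.close();

    var slug = title.toLowerCase().replace(/[^a-z0-9]+/g, '-');
    var data = JSON.parse(fs.readFileSync('_harp/_data.json', 'utf8'));

    // Same JS-timestamp style the WordPress import produced
    data[slug] = { title: title, summary: summary, date: Date.now() };

    fs.writeFileSync('_harp/_data.json', JSON.stringify(data, null, 2));
    fs.writeFileSync('_harp/' + slug + '.md', '');
    console.log('Created _harp/' + slug + '.md');
  });
});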

I know the whole process sounds like a huge amount of work. It was! But I think everything above could be sorted with an improved WP import tool that works on WXR files and some kick-ass blog templates rather than web-site templates. I’m really looking forward to using Harp when I port an old web-site next.

Update 1: I forgot to mention a few things:

First was Disqus itself. Adding it as a plugin and using the import feature there didn’t work for comments. But importing the WXR file worked great and all old comments are now visible using Disqus.

Second was the actual switch from WP on Blacknight to GitHub Pages. This was a bit fiddly. I use a non-www domain and learned that you can’t CNAME from conoroneill.net over to conoro.github.io. Luckily I use DNSMadeEasy for DNS and they have a feature called ANAME which does allow this. But before I turned that on, I had to move the WP blog elsewhere. This involved creating a new domain called conoroneillnet.conoroneill.com and associating it to the old WP install on Blacknight. Then I used the Automatic Domain Changer plugin to reset everything in WP and its database to the new archive domain. Then finally I made the DNSMadeEasy change. And ta-daaaa, I’m now on GitHub Pages. I’m sure it’ll take a while for the DNS changes to propagate. I’ll now turn off Google indexing of the old blog to avoid duplicate content issues.

Update 2: Now that everything is moved, I’ve just realised I have a very serious problem - trailing slashes, or more precisely, the lack of them. On WordPress all of my URLs ended with a trailing slash. By default, HarpJS doesn’t do this since it treats / as meaning a directory, not a file. This wouldn’t be so bad if there was a redirect from one to the other, but there isn’t; the URLs with trailing slashes give a 404. Which basically kills every existing inbound link to the blog (and messes up Disqus to boot). There is a way around it: put every post in its own directory of that name and rename every file to index.md. This seems like overkill when an auto-redirect really should be the default. Of course this is a GitHub Pages issue not a Harp issue. Not sure what I’m going to do now…

Update 3: As I want to avoid “downtime”, I bit the bullet and took the directory approach for all old posts. The new ones will be fine. This code did the trick (thanks Stack Overflow) in the _harp dir:


find . -name "*.md" -exec sh -c 'NEWDIR=`basename "$1" .md` ; mkdir "$NEWDIR" ; mv "$1" "index.md"; mv "index.md" "$NEWDIR" ' _ {} \;

Update 4: Damn and blast, I forgot about /feed not redirecting too. So I created a /feed directory and put the feed.jade file in there as index.jade and set layout to false in a new _data.json file in that directory (after trying 20 other things!).