XSLT, node.js 0.10 and a Fun Two Days of Native Modules and Memory Leaks
Thursday, 25th April 2013, 17:14

What a fun week it's been! Let's start near the beginning, with the final jump of upgrading to node.js 0.10, which I took this week with the various sites I run. First job of the day: upgrade my test server and see what broke. And what did break? Well, a few modules, so I carefully updated them one by one, with fewer and fewer broken things each time. Then I got to the first major hurdle, the XSLT module I've relied on for a while.

See, the approach I've been taking to building bespoke CMS systems quickly has been to use XML and XSLT templates for controlling content. Sure, you could take the Wordpress route of letting anyone do anything they want with a page, often including hackers... Seriously, using Wordpress for your website these days is like having a Yahoo or Gmail account for your business contact address. Spend a bit more money and get something done properly!

Oh, and if you still build websites using a LAMP stack, I feel for you. PHP is awful and deserves to die. I have nothing against MySQL other than a historical dislike for its lack of standards and late support for true relational structures. Heck, I can even understand why someone would run Apache these days, even though it is an archaic old dragon with a dreadful config file format in comparison to something sleek and modern like nginx.

But PHP has always been the VisualBasic of the web development world: apart from being free and having a low entry bar for budding programmers, it is awful. The flat scripting model is outdated for modern needs, and the language is one of the worst, most inconsistent piles of garbage I've ever come across. Whilst people have done good things with it, plenty of great artists have done good things with total garbage. But it's still garbage.

Where Was I? Oh Yes node.js Upgrade

So anyway, there is little point having a designer spend ages making a website look good, if an over-eager user then turns up and starts using strange font sizes, or even worse creating a huge mess with the layout, and either thinking it looks better when it just looks unprofessional, or in even more frequent cases not even noticing what they did.

The answer here is to restrict what they can do and not let them control the formatting, and XSLT templates are a handy way to do this, even if XSLT itself is dreadful and almost deliberately created to be impossible to read and follow, even with syntax highlighting. But it will do.

The CMS sites I build have two templates for each item of content. Sometimes that item is something generic like a paragraph, other times a whole detailed collection of content. One XSLT template is used to present the XML data users enter as an HTML edit form, the other is used to display the XML as final HTML content. Converting the submitted edit form into XML is a trivial task.
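To give an idea of how trivial that last step can be, here is a minimal sketch of turning submitted form fields into XML. The names `escapeXml` and `formToXml` are made up for illustration and are not from my CMS, and it assumes the field names are already valid XML element names.

```javascript
// Hypothetical helper: escape the five characters that are special in XML.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical converter: one child element per submitted field.
// Assumes field names are already valid XML element names.
function formToXml(rootName, fields) {
  const children = Object.keys(fields)
    .map((name) => `  <${name}>${escapeXml(fields[name])}</${name}>`)
    .join('\n');
  return `<${rootName}>\n${children}\n</${rootName}>`;
}

// Example: a generic "paragraph" item as it might arrive from an edit form.
const xml = formToXml('paragraph', {
  title: 'Tabs & Spaces',
  body: 'Use <tabs>, always.',
});
console.log(xml);
```

The escaping is the only part that really matters: anything the user typed stays as text in the XML, so the display template alone decides the markup.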

For XML, node.js offers a number of modules, but for XSLT I could only ever find one, and that wasn't even listed on the third party modules page. I only discovered it through StackOverflow via Google.

Node 0.10 broke it, and the author seemed a tad AFK. So what to do? Well, actually it didn't look that complicated a module, so I figured that since I'd never made a native C/C++ module for node.js, perhaps I'd throw a day at this and see if I could make one. Thus libxsltjs was born.

It turns out that V8 has been made as complicated and confusing to muck around with as MAME, for the very same reasons. Templates and bad developer documentation! Don't get me wrong, I'm not against templates at all, but they do introduce a level of obfuscation that macros also like to add.

Using the node_xslt and far-too-sparse-for-a-company-the-size-of-Google V8 documentation as a reference, I slowly began to decode the macros used in the former and the few examples of the latter, with everything working fine at the end.

With the new module created, I also replaced connect with fastworks.js on all my remaining sites, because yet another connect update had caused something to break. That kind of breakage was a big motivation for me writing my own framework in the first place.

PostgreSQL Starts Imploding

Alarms began to go off a few days after the upgrade, and they all pointed to the database server. The logs were spamming "FATAL: the database system is in recovery mode" at me, which after some googling seemed to indicate a hardware issue. Hardware issues are really not what any system admin wants to read about, but a reboot of the box appeared to fix things.

At least for a day. Then the same thing happened again, so this wasn't a one-off, this was a serious new issue. After a bit more digging I discovered that the reason PostgreSQL was going into recovery mode related to an interesting feature that CentOS, like any other Linux, has with regard to low memory situations.

Basically, when the box is running out of memory, the kernel's OOM killer starts to look around for something to KILL. And it was choosing a PostgreSQL process, which made PostgreSQL panic and think it had crashed, which resulted in a distrust of the database integrity, and that kicked off recovery mode.

So now knowing the cause of the implosion was a low memory condition, and that this was occurring at least once every 24 hours, all the signs were pointing to a dreaded memory leak. And it didn't take me long to find a culprit, as I watched node processes grow in memory usage higher and higher as time went on.

Boot the Test Server!

Starting up a node app on my test server, I used siege to throw some connections at it. Just 1000 of those caused the resident memory usage to rise from its initial 30m to well over a hundred, and after 3000 or so it was in the 250m zone. Leaving it for a few minutes showed that it was never going down.
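I watched those numbers from outside the process, but the same figures can be logged from inside the app. This is only a sketch, not what I ran at the time: `toMegabytes`, `sampleMemory` and `watchMemory` are made-up names, while `process.memoryUsage()` is the standard node call.

```javascript
// Convert a byte count to megabytes, rounded to one decimal place.
function toMegabytes(bytes) {
  return Math.round((bytes / (1024 * 1024)) * 10) / 10;
}

// Take one sample of this process's memory usage.
// rss is the resident set size, heapUsed is the V8 JavaScript heap in use.
function sampleMemory() {
  const usage = process.memoryUsage();
  return {
    rssMb: toMegabytes(usage.rss),
    heapUsedMb: toMegabytes(usage.heapUsed),
  };
}

// Log a sample at a fixed interval; returns the timer so it can be stopped.
function watchMemory(intervalMs) {
  const timer = setInterval(() => {
    const s = sampleMemory();
    console.log(`rss=${s.rssMb}m heapUsed=${s.heapUsedMb}m`);
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}

// Take a single sample; call watchMemory(5000) in a long-running app instead.
const first = sampleMemory();
console.log(first);
```

If rss keeps climbing during a siege run while heapUsed stays roughly flat, that would typically point at memory allocated outside the JavaScript heap, which is where a native module does its allocating.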

I suspected I knew what the cause of this memory leak was too, so I disabled all XML and XSLT commands from the test site, and ran the siege again. This time, no significant growth at all, it was definitely my new node module. :(

But I had a good idea why. I'd sort of assumed that when I passed an object to JavaScript, V8 would free it by itself, but it doesn't if the object wasn't created by the V8 library. Why would it? How would it know my module didn't need it anymore, let alone the right way of freeing it?

So after even more googling I discovered I had to create a persistent handle, which is what V8 uses to reference objects that are passed to JavaScript but never cleared up by the garbage collector. And then you have to mark it as a weak handle, and provide a callback function for when it is no longer referenced in JS.
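The actual fix lives in the C++ of the module, and I won't reproduce it here. The same idea can be sketched in plain JavaScript with `FinalizationRegistry`, a standard feature that arrived in node long after 0.10: register a wrapper, and get a callback once the garbage collector has dropped it. Everything else below (`nativeAlloc`, `nativeFree`, `XmlDocument`) is a made-up stand-in for native memory, so treat it as an analogy rather than the V8 API.

```javascript
// Stand-in for a native allocation such as a parsed XML document.
// In a real addon this would be memory owned by the C library.
let liveNativeDocs = 0;

function nativeAlloc() {
  liveNativeDocs += 1;
  return { freed: false };
}

function nativeFree(resource) {
  if (resource.freed) return; // guard against a double free
  resource.freed = true;
  liveNativeDocs -= 1;
}

// The registry calls back some time after a registered wrapper has been
// garbage collected, the JS-level analogue of a weak handle callback.
const registry = new FinalizationRegistry((resource) => nativeFree(resource));

class XmlDocument {
  constructor() {
    this.resource = nativeAlloc();
    // The held value is the resource, not the wrapper, so the wrapper
    // itself stays collectable. The wrapper doubles as unregister token.
    registry.register(this, this.resource, this);
  }

  // Deterministic release for callers that know they are done.
  free() {
    registry.unregister(this);
    nativeFree(this.resource);
  }
}

const doc = new XmlDocument();
console.log(liveNativeDocs); // one live native document
doc.free();
console.log(liveNativeDocs); // back to zero
```

The explicit `free()` is there because the collector gives no promise about when, or even whether, the callback will run, so it is best treated as a safety net.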

What kind of stupid way of doing things is that? A weak persistent handle? Surely there were better ways to approach that whole issue? It looks very much like something which evolved rather than was planned.

Getting the thing to compile was a nightmare, and the documentation on it is almost Facebook level awful. Eventually I had to take inspiration from the node.js source code itself on how to do things, but finally I managed it and the memory leak was cured.

Just as well, since as things stood I was restarting 10 or so node processes every 30 minutes just to keep the memory level low! There is nothing like a constant reminder that you HAVE to fix something to put unnecessary pressure on you to fix it. :)
