Getting base64 images from PostgreSQL bytea field
We were using node js to convert bytea images using a simple algorithm. However, having requests holding about 100MB of images in them quickly became a performance problem as our server would time out before being able to process all the images.
bytesBufferToBase64(buffer) {
let binary = '';
const bytes = new Uint8Array(buffer);
const len = bytes.byteLength;
for (let i = 0; i < len; i += 1) {
binary += String.fromCharCode(bytes[i]);
}
return btoa(binary);
}
We tried postgres-bytea node as a possible solution, but the same problem remained. Let’s take a glimpse at the source code:
module.exports = function parseBytea (input) {
if (/^\\x/.test(input)) {
// new 'hex' style response (pg >9.0)
return Buffer.from(input.substr(2), 'hex')
}
var output = ''
var i = 0
while (i < input.length) {
if (input[i] !== '\\') {
output += input[i]
++i
} else {
if (/[0-7]/.test(input.substr(i + 1, 3))) {
output += String.fromCharCode(parseInt(input.substr(i + 1, 3), 8))
i += 4
} else {
var backslashes = 1
while (i + backslashes < input.length && input[i + backslashes] === '\\')
for (var k = 0; k < Math.floor(backslashes / 2); ++k) {
output += '\\'
}
i += Math.floor(backslashes / 2) * 2
}
}
}
return Buffer.from(output, 'binary')
}
The problem in both cases was that our Node.js server was basically looping over the entire bytea. At about 100MB of data, that’s about 100 million loops in our software. 🤯 No wonder the server was timing out!
The root of the problem is that PostgreSQL bytea format, containing escape characters (i.e 123\000\001) is not very compatible with Node.js. It is simply not a native format for the Buffer class, hence the conversion.
Funny thing is that the very existence of a package like postgre-bytea encourages users to think this is the right way to do it. Of course most people using bytea fields would use it for thumbnail or small images, so I guess they would rarely see the performance impact on their node-js software like we did.
If Node-js is bad at handling bytea coming from PostgreSQL, then why use the bytea format?
Think outside the box!
What if the conversion to base64 was done at the source? No bytea, no problem eh?
Well it’s possible, and very easy:
SELECT encode(data, 'base64') FROM image where id = 145
Et voilà!
The conversion is done on the spot with optimised performances (I trust PostgreSQL team on that) and the database is delivering base64 directly into our Node.js software. It’s not even faster, there is no conversion needed at all, there is less code, making for better code!