Adding Multi-File Type Support to the Blog Encryption System
The Problem
In previous posts, we built an end-to-end encryption system for the blog: Markdown text is encrypted with AES-256-GCM, images are encrypted separately into .enc files, and everything is decrypted and rendered in the browser. This setup works perfectly for standard blog posts, but it has one fundamental limitation—it only handles .md files.
In practice, a lot of content isn't a natural fit for Markdown:
- A research paper draft is already a PDF.
- An experiment report is most intuitive as a Jupyter Notebook.
- A tech talk presentation is a PPTX file.
- Mathematical derivations written in LaTeX lose formatting when converted to Markdown.
Forcing a conversion to Markdown would destroy layout and interactivity, while treating them as simple attachments ruins the "read online" experience. I need the encryption system to natively support these file types.
Design: A Type-Based Dispatch System
Since a one-size-fits-all process is impossible, the solution is to dispatch based on file type. The core design is this: at build time, perform different preprocessing steps based on the file type, encrypt everything into a standard JSON format with a type flag, and on the client-side, choose a rendering method based on that flag.
| File Type | Extension | Build-Time Action | Client-Side Rendering |
|---|---|---|---|
| Text | .md | Encrypt raw Markdown | marked.js → HTML |
| Notebook | .ipynb | Convert to HTML → Encrypt | Inject HTML directly |
| Native PDF | .pdf | Encrypt raw binary | Blob URL + `<iframe>` |
| Convertible | .tex .docx .pptx etc. | Convert to PDF → Encrypt binary | Same as PDF |
The key takeaway: All file types share the same two-layer key system (PBKDF2 + AES-256-GCM). The changes are confined to the preprocessing and rendering stages, leaving the encryption core untouched.
Build Time (Node.js)
══════════════════════════════════════════════════════════════
.md ─────────────────────→ Encrypt text ─→ slug.json (type: markdown)
.ipynb ──→ convertNotebook → Encrypt HTML ─→ slug.json (type: html)
.pdf ────────────────────→ Encrypt binary ─→ slug.json (type: pdf)
.tex ───→ pdflatex ──┐
.pptx ──→ soffice ───┤──→ Encrypt binary ─→ slug.json (type: pdf)
.docx ──→ soffice ───┘
Client-Side
══════════════════════════════════════════════════════════════
slug.json ──→ User enters password ──→ Decrypt slot with PBKDF2 ──→ Decrypt with AES-GCM
                                          │
            ┌─────────────────────────────┼─────────────────────────────┐
            ▼                             ▼                             ▼
     type: markdown                   type: html                    type: pdf
     marked.parse()                Directly inject                   Blob URL
            ▼                             ▼                             ▼
       Render HTML                 Render notebook                  <iframe>
Build-Time Implementation
File Discovery: From Markdown to All Types
The first step is to teach the build script to recognize new file types. The original findMarkdownFiles is upgraded to findEncryptableFiles:
const TEXT_TYPES = ['.md'];
const NOTEBOOK_TYPES = ['.ipynb'];
const PDF_TYPE = ['.pdf'];
const PDF_CONVERTIBLE = ['.pptx', '.docx', '.rtf', '.tex', '.odt', '.ods', '.odp'];
const ALL_SUPPORTED = [...TEXT_TYPES, ...NOTEBOOK_TYPES, ...PDF_TYPE, ...PDF_CONVERTIBLE];
When scanning the _content/crypt-posts/ directory, the script now matches all files with extensions in ALL_SUPPORTED.
Metadata Source: Frontmatter vs. .meta.json
Encryption settings for Markdown files are defined in YAML frontmatter, which is fine. But binary files don't have frontmatter, so we'll establish a convention: non-Markdown files will use a corresponding .meta.json file for their metadata.
_content/crypt-posts/gol/
├── alife-paper-draft.pdf               # PDF source file
├── alife-paper-draft.meta.json         # Metadata
└── 2026-02-25-cross-experiment-en.md   # Markdown (with built-in frontmatter)
The format of .meta.json mirrors the fields in the frontmatter:
{
"title": "Paper Draft",
"date": "2026-02-26",
"encrypted": [{ "key": "gol", "hint": ["fenda"] }],
"series": "gol"
}
The build script's getEncryptedMeta() function unifies the logic for reading from both sources:
function getEncryptedMeta(filePath) {
const ext = path.extname(filePath).toLowerCase();
if (ext === '.md') {
// Markdown: Read from frontmatter
const { frontmatter, body } = parseFrontmatter(fs.readFileSync(filePath, 'utf-8'));
if (!frontmatter.encrypted) return null;
return { encrypted: frontmatter.encrypted, body, ext };
}
// Other files: Read from .meta.json
const metaPath = filePath.replace(new RegExp(`\\${ext}$`), '.meta.json');
if (!fs.existsSync(metaPath)) return null;
const meta = JSON.parse(fs.readFileSync(metaPath, 'utf-8'));
if (!meta.encrypted) return null;
return { encrypted: meta.encrypted, ext };
}
Content Preprocessing Dispatch
This is the most critical change on the build side—branching the processing logic based on the file extension ext:
if (TEXT_TYPES.includes(ext)) {
// Markdown: Process image references + encrypt raw text
const processedBody = processMarkdownImages(meta.body, filePath, articleKeyString, slugDir);
contentBuffer = Buffer.from(processedBody, 'utf-8');
contentType = 'markdown';
} else if (NOTEBOOK_TYPES.includes(ext)) {
// Notebook: Convert to HTML at build time, then encrypt the HTML
const html = convertNotebook(filePath);
contentBuffer = Buffer.from(html, 'utf-8');
contentType = 'html';
} else if (PDF_TYPE.includes(ext)) {
// PDF: Encrypt the raw binary directly
contentBuffer = fs.readFileSync(filePath);
contentType = 'pdf';
} else if (PDF_CONVERTIBLE.includes(ext)) {
// LaTeX/Office → Convert to PDF first → Then encrypt the binary
const pdfPath = convertToPdf(filePath);
contentBuffer = fs.readFileSync(pdfPath);
contentType = 'pdf';
}
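The same branching can be factored into a small pure helper. This is a hypothetical refactor, not the script's actual code, but it makes the extension-to-type mapping easy to unit-test:

```javascript
const TEXT_TYPES = ['.md'];
const NOTEBOOK_TYPES = ['.ipynb'];
const PDF_TYPE = ['.pdf'];
const PDF_CONVERTIBLE = ['.pptx', '.docx', '.rtf', '.tex', '.odt', '.ods', '.odp'];

// Map a file extension to its output content type and the preprocessing
// step it requires (null means "encrypt as-is").
function classify(ext) {
  if (TEXT_TYPES.includes(ext)) return { contentType: 'markdown', convert: null };
  if (NOTEBOOK_TYPES.includes(ext)) return { contentType: 'html', convert: 'notebook-to-html' };
  if (PDF_TYPE.includes(ext)) return { contentType: 'pdf', convert: null };
  if (PDF_CONVERTIBLE.includes(ext)) return { contentType: 'pdf', convert: 'to-pdf' };
  throw new Error(`Unsupported extension: ${ext}`);
}
```

Keeping the mapping in one place means adding a new convertible format is a one-line change to `PDF_CONVERTIBLE`.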
Format Conversion
We'll use pdflatex for LaTeX and LibreOffice in headless mode for Office documents:
const { execSync } = require('child_process');
const os = require('os');

function convertToPdf(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'enc-build-'));
  if (ext === '.tex') {
    execSync(`pdflatex -interaction=nonstopmode -output-directory="${tmpDir}" "${filePath}"`,
      { timeout: 60000, stdio: 'pipe' });
  } else {
    execSync(`soffice --headless --convert-to pdf --outdir "${tmpDir}" "${filePath}"`,
      { timeout: 60000, stdio: 'pipe' });
  }
  // Both tools write <basename>.pdf into tmpDir
  return path.join(tmpDir, path.basename(filePath, ext) + '.pdf');
}
The -interaction=nonstopmode flag prevents LaTeX from waiting for interactive input on compilation errors, and --headless allows LibreOffice to run in an environment without a GUI.
Changes to the Encryption Function
The function signature is generalized:
// Old: encryptPost(markdownBody, articleKey, passwords)
// New: encryptContent(contentBuffer, articleKey, passwords, contentType)
The output JSON now includes a type field, which the client-side uses to determine the rendering method:
{
"type": "pdf",
"content": { "iv": "...", "ciphertext": "..." },
"keys": [{ "salt": "...", "iv": "...", "data": "...", "tag": "..." }]
}
If type is missing, the client defaults to markdown for backward compatibility with older posts.
Client-Side Implementation
tryDecrypt() Returns Content Type
The decryption function needs to read the type field from the encrypted JSON and return it. Note that since PDF data is raw binary, running it through TextDecoder would corrupt it; for the pdf type we return the ArrayBuffer directly:
const contentType = encData.type || 'markdown';
if (contentType === 'pdf') {
return { rawBuffer: decrypted, contentKeyB64, contentType };
}
return { plaintext: new TextDecoder().decode(decrypted), contentKeyB64, contentType };
renderDecryptedContent() Renders by Type
This new function is the client-side "dispatch center":
function renderDecryptedContent(result, article) {
const type = result.contentType || 'markdown';
if (type === 'pdf') {
const blob = new Blob([result.rawBuffer], { type: 'application/pdf' });
const url = URL.createObjectURL(blob);
article.innerHTML = `<iframe src="${url}" style="width:100%;height:80vh;
border:1px solid #ddd;border-radius:6px;" type="application/pdf"></iframe>`;
return null;
}
if (type === 'html') {
article.innerHTML = encNotebookCss + result.plaintext;
return result.plaintext;
}
// Default: markdown
article.innerHTML = marked.parse(result.plaintext);
return result.plaintext;
}
For PDFs, we use Blob + URL.createObjectURL to create a temporary, in-memory URL. The decrypted data never leaves the browser's memory, maintaining end-to-end security.
Session Cache Support for PDF Binaries
sessionStorage can only store strings, not an ArrayBuffer. For PDFs, we first encode the binary data as Base64 before storing it:
if (type === 'pdf') {
const bytes = new Uint8Array(result.rawBuffer);
let binary = '';
for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
sessionStorage.setItem(key, JSON.stringify({
content: btoa(binary), contentType: 'pdf', ts: Date.now()
}));
}
And decode it when reading from the cache:
if (cached.contentType === 'pdf') {
const binary = atob(cached.content);
const bytes = new Uint8Array(binary.length);
for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
cachedResult.rawBuffer = bytes.buffer;
}
This way, users don't have to re-enter their password if they refresh the page within the same session.
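Factored into a pair of helpers (hypothetical names, mirroring the two loops above), the cache round trip is easy to verify; btoa/atob are available both in browsers and in Node.js 16+:

```javascript
// Encode an ArrayBuffer as Base64 for sessionStorage (which only holds strings).
function bufferToBase64(buf) {
  const bytes = new Uint8Array(buf);
  let binary = '';
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

// Decode a Base64 string back into an ArrayBuffer when reading the cache.
function base64ToBuffer(b64) {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes.buffer;
}
```

For multi-megabyte PDFs the character-by-character loop is not the fastest option, but it avoids the argument-count limits that `String.fromCharCode.apply` can hit on large arrays.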
Build-Time Results
$ node _tools/build-encrypted-posts.js
Encrypting: posts/gol/2026-02-25-cross-experiment-en (.md)
-> encrypted image: gol/.../cross_competition_dynamics.png.enc
-> blog/encrypted/gol/2026-02-25-cross-experiment-en.json [markdown] (1 key slots)
Encrypting: posts/2026/2026-02-26-encryption-test-latex (.tex)
-> blog/encrypted/2026/2026-02-26-encryption-test-latex.json [pdf] (1 key slots)
Encrypting: posts/gol/alife-paper-draft (.pdf)
-> blog/encrypted/gol/alife-paper-draft.json [pdf] (1 key slots)
Encrypted 16 post(s).
The LaTeX file is compiled to a PDF by pdflatex at build time, encrypted, and then rendered in an iframe on the client-side, preserving all formulas and formatting perfectly.
Summary
This update extends the encryption system from "Markdown-only" to "encrypt almost anything." The design principles are preprocess-dispatch, unify-encrypt, and render-dispatch:
- At build time, apply different preprocessing based on file type (leave text as-is, convert notebooks to HTML, convert others to PDF).
- The encryption core remains completely unchanged, with all types sharing the PBKDF2 + AES-256-GCM two-layer key system.
- On the client-side, select a rendering method based on the type field (marked.js / direct injection / iframe).
Not a single line of the underlying encryption logic was modified, so the security model remains as strong as before. The new build dependencies (pdflatex, LibreOffice) are only required if you use the corresponding file types—if you stick to Markdown and PDFs, you don't need to install anything new.