Day 5: Creating a Dynamic Blog with MDX and Migrating Legacy Content

📢 If you haven't read the previous parts, check out Day 1, Day 2, Day 3 and Day 4!
📚 You can also explore all posts from the 7 Days Website Revamp series!
Why We Chose MDX for Our Blog
We needed a blogging system that was:
- Developer-friendly
- SEO-optimized
- Flexible enough to embed custom components
- Easy to manage content without building a CMS
MDX (Markdown + JSX) was the perfect choice — allowing us to write content like Markdown, but embed React components when needed.
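For example, a post can drop from plain Markdown into a React component mid-document. A quick illustration (the `Callout` component and its import path here are hypothetical):

```mdx
import Callout from "@/components/Callout";

## A normal Markdown heading

Regular **Markdown** syntax works as usual, but React is one tag away:

<Callout type="info">
  This box is a React component rendered inline with the prose.
</Callout>
```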
Setting Up MDX in Next.js
We followed the official Next.js MDX guide.
First, we installed the required packages:
```bash
npm install @next/mdx @mdx-js/loader
```
Then, we updated `next.config.js` to handle `.mdx` pages:

```js
import createMDX from "@next/mdx";

// Wrap the Next.js config so the bundler knows how to compile .md/.mdx files.
const withMDX = createMDX({
  extension: /\.mdx?$/,
});

const nextConfig = withMDX({
  // Your existing Next.js config
  pageExtensions: ["js", "jsx", "ts", "tsx", "mdx"],
});

export default nextConfig;
```
Now, our app can render `.mdx` files just like any other React page.
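For instance, a standalone MDX page can now live directly in the route tree. A minimal illustration (hypothetical path, assuming the App Router):

```mdx
{/* src/app/hello/page.mdx: served at /hello with no extra wiring */}

# Hello from MDX

This file needs no layout or page component around it; the MDX itself is the page.
```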
Organizing Blog Content
We’ve organized our blog content in the following directory: `/src/content/blog/`
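On disk this looks roughly like the following (file names are illustrative):

```text
src/content/blog/
├── day-1-wix-to-nextjs.mdx
├── day-5-nextjs-mdx.mdx
└── ...
```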
Each blog post is an `.mdx` file that includes:
- Frontmatter metadata at the top (enclosed within `---`)
- The body content written in Markdown/JSX
Example frontmatter:
```yaml
---
title: "Building a Dynamic Blog with MDX"
author: "Tiến Nguyễn Hữu Anh"
date: "May 3, 2025"
categories: ["Technology"]
featured: true
draft: false
image: "/images/blog/day-5-nextjs-mdx/day5-mdx.webp"
subtitle: "Using MDX to power our content system."
---
```
Parsing and Reading Blog Data
To manage our blog posts dynamically, we created a utility module that handles:
- Reading MDX files from the `/src/content/blog/` directory
- Parsing frontmatter metadata (title, date, author, subtitle, etc.)
- Sorting posts by publication date
- Returning content and metadata for rendering
This allowed us to generate blog indexes, tag pages, and individual post pages easily at build time.
Utility functions include:
- `getBlogPosts()` → Fetch all blog posts
- `getBlogPost(slug)` → Fetch a single blog post by its slug
- `getBlogCategories()` → List all categories and post counts
- `sortByPublicationDate()` → Sort posts newest to oldest
Example Utility Functions
```ts
import fs from "fs";
import path from "path";

type Metadata = {
  title: string;
  subtitle: string;
  image: string;
  author: string;
  date: string;
  categories: string[];
  featured: boolean;
  draft: boolean;
};

export type MDXBlog = {
  metadata: Metadata;
  slug: string;
  content: string;
};

// parseFrontmatter is referenced but not shown in the original post; this is
// a minimal stand-in that reads the "---"-delimited block at the top of the
// file and treats everything after it as the post body.
function parseFrontmatter(filePath: string, rawContent: string) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(rawContent);
  if (!match) throw new Error(`Missing frontmatter in ${filePath}`);
  const metadata: Record<string, unknown> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const i = line.indexOf(":");
    if (i === -1) continue;
    const key = line.slice(0, i).trim();
    const raw = line.slice(i + 1).trim();
    if (raw === "true" || raw === "false") {
      metadata[key] = raw === "true"; // booleans: featured, draft
    } else if (raw.startsWith("[")) {
      // Flow-style arrays, e.g. categories: ["Technology"]
      metadata[key] = raw
        .slice(1, -1)
        .split(",")
        .map((s) => s.trim().replace(/^["']|["']$/g, ""))
        .filter(Boolean);
    } else {
      metadata[key] = raw.replace(/^["']|["']$/g, ""); // quoted strings
    }
  }
  return { metadata: metadata as Metadata, content: rawContent.slice(match[0].length) };
}

function readMDXFile(filePath: string): MDXBlog {
  const rawContent = fs.readFileSync(filePath, "utf-8");
  const { metadata, content } = parseFrontmatter(filePath, rawContent);
  const slug = path.basename(filePath, ".mdx");
  return { metadata, slug, content };
}

function getMDXFiles(dir: string): string[] {
  return fs.readdirSync(dir).filter((file) => file.endsWith(".mdx"));
}

function getMDXData(dir: string): MDXBlog[] {
  const mdxFiles = getMDXFiles(dir);
  return mdxFiles.map((file) => readMDXFile(path.join(dir, file)));
}

export function getBlogPosts(): MDXBlog[] {
  const blogDir = path.join(process.cwd(), "src", "content", "blog");
  return getMDXData(blogDir);
}

export function getBlogPost(slug: string): MDXBlog | null {
  const blogDir = path.join(process.cwd(), "src", "content", "blog");
  const filePath = path.join(blogDir, `${slug}.mdx`);
  if (!fs.existsSync(filePath)) {
    return null; // Return null if the file does not exist
  }
  return readMDXFile(filePath);
}

// ...and other utility functions
```
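The category and sorting helpers named in the list above aren't shown in the post; here is one plausible reconstruction over the same `MDXBlog` data:

```ts
// Reconstruction of the helpers named above (not the original source).
export function sortByPublicationDate(posts: MDXBlog[]): MDXBlog[] {
  // Dates like "May 3, 2025" parse directly with the Date constructor.
  return [...posts].sort(
    (a, b) => new Date(b.metadata.date).getTime() - new Date(a.metadata.date).getTime()
  );
}

export function getBlogCategories(): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const post of getBlogPosts()) {
    for (const category of post.metadata.categories) {
      counts[category] = (counts[category] ?? 0) + 1;
    }
  }
  return counts;
}
```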
Example: Rendering Blog Posts
On the frontend, we used the parsed blog data like this:
```tsx
import { getBlogPosts } from "@/lib/utils";

// This component runs on the server (at build or request time), so the
// fs-based utilities above are safe to call here.
export default function BlogListPage() {
  const blogs = getBlogPosts();
  return (
    <div>
      {blogs.map((post) => (
        <div key={post.slug}>
          <h2>{post.metadata.title}</h2>
          <p>{post.metadata.date}</p>
        </div>
      ))}
    </div>
  );
}
```
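The post stops short of the individual post route. As one sketch, the raw content string returned by `getBlogPost()` could be rendered with the `next-mdx-remote` package (the route path, App Router usage, and the `next-mdx-remote` dependency are assumptions on our part, not something the post confirms):

```tsx
// src/app/blog/[slug]/page.tsx (hypothetical route for illustration)
import { notFound } from "next/navigation";
import { MDXRemote } from "next-mdx-remote/rsc"; // assumed extra dependency
import { getBlogPost, getBlogPosts } from "@/lib/utils";

// Pre-render a static page for every post at build time.
export function generateStaticParams() {
  return getBlogPosts().map((post) => ({ slug: post.slug }));
}

// Next.js 14-style params; Next.js 15 passes params as a Promise instead.
export default function BlogPostPage({ params }: { params: { slug: string } }) {
  const post = getBlogPost(params.slug);
  if (!post) notFound();
  return (
    <article>
      <h1>{post.metadata.title}</h1>
      <MDXRemote source={post.content} />
    </article>
  );
}
```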
Migrating Legacy Blog Posts
We had a lot of older blog posts on our previous Wix website (see why we switched from Wix to Next.js in our Day 1 article).
To streamline the process, we created a small Python crawler to automate the migration:
- Crawled the old site
- Extracted blog titles, dates, categories, and content
- Converted the HTML to Markdown using libraries like `html2text`
- Saved each blog as a clean `.mdx` file in `/src/content/blog/`
This approach minimized manual errors and ensured our blog history was preserved accurately.
Example Python Crawler for Blog Migration
To automate the migration of our legacy blog posts, we built a lightweight crawler using Python, Selenium, and BeautifulSoup.
Below is a simplified structure of the crawler:
```python
from selenium import webdriver
from bs4 import BeautifulSoup
from markdownify import markdownify
import os

def crawl_blog(url, output_dir):
    # Render the page in a real browser so JavaScript-driven content loads.
    driver = webdriver.Chrome()
    driver.get(url)
    html = driver.page_source
    driver.quit()

    # Extract the article body and convert its HTML to Markdown.
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("article")
    content = markdownify(str(article)) if article else ""

    metadata = {
        "title": soup.title.string if soup.title else "Untitled",
        "author": "Unknown",
        "date": "2025-01-01",
        "categories": ["Uncategorized"],
    }

    # Assemble the .mdx file: frontmatter block followed by the body.
    mdx_content = f"""---
title: "{metadata['title']}"
author: "{metadata['author']}"
date: "{metadata['date']}"
categories: {metadata['categories']}
---

{content}
"""

    os.makedirs(output_dir, exist_ok=True)
    with open(os.path.join(output_dir, "post.mdx"), "w", encoding="utf-8") as f:
        f.write(mdx_content)

crawl_blog("https://example.com/blog-post", "output")
```
Challenges and Lessons Learned
- **Frontmatter Consistency:** Every blog post needed clean and accurate frontmatter to ensure proper rendering, SEO, and linking.
- **Handling Images:** Old blog images were downloaded and referenced locally to maintain performance and eliminate dependency on external URLs.
- **Markdown Cleanup:** Some artifacts from old HTML formatting needed manual fixing after automatic conversion.
- **SEO Considerations:** Preserving original publication dates and metadata ensured better continuity for Google indexing.
Final Thoughts
Using Next.js + MDX combined with simple utilities allowed us to create a fast, scalable, and flexible blogging platform without the complexity of a traditional CMS.
Migrating our old content automatically also saved time, reduced errors, and kept the project moving forward efficiently.
In the next post, we'll explore how we built a dynamic Careers Page using Markdown and AWS Lightsail databases!
You can now continue with Day 6: Building a Careers Page with Markdown and Lightsail Database!
📚 You can also explore all posts from the 7 Days Website Revamp series!