A brief primer on web apps for ITM students (part 1)

2024-01-07

Welcome back to my advice series. This piece is the first article in a series for anyone in any of my ITM classes who has the Digital Cafe as one of their assignments. The intent of this series (and of the Digital Cafe assignment) is to help you prototype your own web apps.

Alone, this piece might not be enough to complete the assignment. Web development is difficult. Don't let anyone tell you otherwise. I hope I can at least share my understanding of how web apps work and how I think you should be thinking about them.

What websites are

At some point this semester, you will complete an "Intro to HTML" assignment. To those of you who have already gone through that assignment, I ask you to reflect on what you built.

All we asked you to do in that assignment was write a set of plain text files. The information in each of those files represented a single "page," a virtual location that stored some sort of content that your browser knows how to render. Critically, each page also contained references to other pages; you could traverse every page in your assignment folder just by clicking the links that each page held to the others.

It's just content and links to other content.

That's all a website has to be. That's how most websites were when the Web first appeared. The only real difference is that the plain text files were served over the internet. They didn't all live on a single computer the way your assignment's files do. Whereas you can reference a locally stored page with its file path, you have to use an IP address or a domain name to reference a page stored remotely. However, you can reference other pages -- and thus link to them -- all the same.

I bet some of you have never seen a website like this before. Here is the personal blog of Paul Graham, one of the founders of the Y Combinator startup accelerator. Its only job is to host Paul Graham's essays.

Again, it's just content and links.

This very site, joeilagan.com, is spiritually one of these old websites. I could have easily just written all of these articles as plain HTML and hosted them on a web server.

Of course, this isn't how most (prominent) websites are today. Dan Olson, my favorite video essayist, describes the current web as such:

The current state of the web, concentrated in a few mega platforms, is the result of compounding complexity.

We used to have a web where anyone could learn to write a webpage in HTML in an afternoon. It’s just writing text and then using tags to format the text.

But over time people, understandably, wanted the web to do more, to look better, and so the things that were possible expanded via scripting languages that allowed for dynamic, interactive content.

Soon the definition of what a “website” was and looked like sailed out of reach of casual users, and eventually even out of reach of all but the most dedicated hobbyists. It became the domain of specialists.

State

Web development starts becoming complicated when we introduce state into our websites.

Think again about your "Intro to HTML" assignment. As it is, it is only fit to be a "brochure" website for a company: a website that simply displays information. There's nothing wrong with that, but what happens when you want to do more?

Imagine now that your client wants to update the site with a new product. You would need to do at least two things by hand: write a new page for that product, and update the existing pages so that they link to it.

This isn't difficult if your website only has five or six products. Imagine now, though, that you have hundreds of products and that new products are added and removed multiple times every day, maybe even every hour. The solution of writing new pages and updating existing pages by hand becomes untenable.

The answer the industry came up with to make this simpler is to first represent content in a more compact form. Your "Intro to HTML" website represents its content directly in HTML. To manage a high volume of content, it would be simpler to represent your products as rows in a spreadsheet first, then find a way to translate that spreadsheet into HTML. This way, you can add products to your website simply by adding rows to the products table in the spreadsheet.
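To make the "rows first, HTML second" idea concrete, here is a minimal sketch in Python. The product names, prices, and the function name `render_product_list` are all made up for illustration; the only point is that the HTML is generated from the rows instead of being typed by hand.

```python
from html import escape

# Hypothetical product rows, as they might appear in a spreadsheet.
products = [
    {"name": "Espresso", "price": 90},
    {"name": "Cafe Latte", "price": 120},
]

def render_product_list(rows):
    """Translate product rows into an HTML list."""
    items = "\n".join(
        # escape() keeps characters like & and < from breaking the page.
        f"  <li>{escape(row['name'])}: {row['price']}</li>"
        for row in rows
    )
    return f"<ul>\n{items}\n</ul>"

print(render_product_list(products))

# Adding a product is now just adding a row; the HTML follows from it.
products.append({"name": "Iced Tea", "price": 80})
print(render_product_list(products))
```

Notice that nothing in `render_product_list` changes when the list of products grows. That is the whole appeal.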

Of course, replace "spreadsheet" here with "database." If you don't come from a programming background, think of a database as a very programmatic spreadsheet. Instead of interacting with your tables through a point-and-click interface, you interact with them through issuing commands. The core concept of storing data as rows in tables remains the same.
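If "issuing commands" sounds abstract, this is roughly what it looks like. The sketch below uses SQLite through Python's built-in `sqlite3` module, which is only one of many databases you could pick. The `products` table and its columns are assumptions I made up for the example.

```python
import sqlite3

# An in-memory database, so this sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# One command creates the table (think: a new sheet with named columns).
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)"
)

# More commands add rows to it.
conn.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("Espresso", 90))
conn.execute("INSERT INTO products (name, price) VALUES (?, ?)", ("Cafe Latte", 120))
conn.commit()

# Another command reads the rows back, sorted by price.
rows = conn.execute("SELECT name, price FROM products ORDER BY price").fetchall()
for name, price in rows:
    print(name, price)
```

The commands inside the quotes are SQL, the language most databases understand. The rows come back as plain Python tuples that you can then turn into HTML.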

Merely adding a database won't quite let you write a Facebook or a YouTube, but it opens up a lot of possibilities. With a database, your website can be dynamic. It can change without you having to edit its source code.

Hacker News, a site where users can post "anything that good hackers would find interesting," demonstrates this spirit well. By modern web standards, this is not a pretty website, but a lot of programming types like it because it simply does its job. It lets users (oh, we'll get to users eventually!) upload "items" on which other users can leave comments. All these items and comments are stored in the site's database. When you want to view an item, it retrieves the item and its comments from its database and renders them as HTML.
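Here is a rough sketch of that "retrieve, then render" step. To be clear, this is not how Hacker News is actually implemented; I have no idea what its tables look like. The `items` and `comments` tables, their columns, and the function `render_item` are all hypothetical.

```python
import sqlite3
from html import escape

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, item_id INTEGER, body TEXT);
    INSERT INTO items (id, title) VALUES (1, 'Show: my first web app');
    INSERT INTO comments (item_id, body) VALUES (1, 'Nice work!');
    INSERT INTO comments (item_id, body) VALUES (1, 'How did you host it?');
""")

def render_item(item_id):
    """Fetch one item and its comments, then build the page's HTML."""
    item = conn.execute(
        "SELECT title FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    if item is None:
        return "<h1>Not found</h1>"
    comments = conn.execute(
        "SELECT body FROM comments WHERE item_id = ? ORDER BY id", (item_id,)
    ).fetchall()
    comment_html = "\n".join(f"  <li>{escape(body)}</li>" for (body,) in comments)
    return f"<h1>{escape(item[0])}</h1>\n<ul>\n{comment_html}\n</ul>"

print(render_item(1))
```

When a user leaves a new comment, only the database changes. The next call to `render_item(1)` picks it up without anyone editing the site's code.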

You can write a dynamic site without ever touching a database: just store your data in a file. This can achieve the same dynamism as long as you don't go beyond a certain scale. The industry generally agrees that this is a bad idea, though, so we will be assuming that you're using a proper database going forward.
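For the curious, the file-based approach might look something like the sketch below. The file name `products.json` and both helper functions are hypothetical, and I write to a temporary folder so the example doesn't touch your real files.

```python
import json
import tempfile
from pathlib import Path

# A temporary directory keeps this sketch from touching real files.
data_file = Path(tempfile.mkdtemp()) / "products.json"

def load_products():
    """Read every product from the file (an empty list if there is no file yet)."""
    if not data_file.exists():
        return []
    return json.loads(data_file.read_text(encoding="utf-8"))

def add_product(name, price):
    """Read the whole file, append one product, and write the whole file back."""
    products = load_products()
    products.append({"name": name, "price": price})
    data_file.write_text(json.dumps(products), encoding="utf-8")

add_product("Espresso", 90)
add_product("Cafe Latte", 120)
print(load_products())
```

One reason this stops working at scale: every change rewrites the whole file, and if two requests call `add_product` at the same moment, one of them can overwrite the other's change. Databases are built to handle exactly that situation.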

Model, View, Controller

This new requirement of dynamism alone adds a lot of complexity to a website.

Put the database aside for now. This very site doesn't use one. What you really care about is state.

I write these articles as Markdown files. When you visit my site's home page, a Go program reads the files in my content/ folder and constructs my home page based on what it finds. When you click on an article, the path /article/{{article_file_name}} is appended to the URL in the browser, which tells the Go program to instead read the specific file and construct the HTML to render it.
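My actual site is written in Go, but the idea fits in a few lines of Python. Treat the following as a loose sketch of the same flow and not as my site's real code: the file names, the function names, and the temporary `content_dir` folder are all invented for the example, and I skip the Markdown-to-HTML conversion entirely.

```python
import tempfile
from html import escape
from pathlib import Path

# Stand-in for a content/ folder, filled with two made-up articles.
content_dir = Path(tempfile.mkdtemp())
(content_dir / "web-apps-part-1.md").write_text("A brief primer on web apps", encoding="utf-8")
(content_dir / "hello.md").write_text("Hello, world", encoding="utf-8")

def render_home():
    """Build the home page from whatever files are in the folder right now."""
    links = "\n".join(
        f'  <li><a href="/article/{escape(p.stem)}">{escape(p.stem)}</a></li>'
        for p in sorted(content_dir.glob("*.md"))
    )
    return f"<ul>\n{links}\n</ul>"

def render_article(name):
    """Read one specific file and wrap its text in HTML."""
    path = content_dir / f"{name}.md"
    # Refuse names with slashes so a request cannot wander outside content_dir.
    if "/" in name or not path.is_file():
        return "<h1>Not found</h1>"
    return f"<article>{escape(path.read_text(encoding='utf-8'))}</article>"

def handle(url_path):
    """Decide what to build based on the path in the URL."""
    if url_path == "/":
        return render_home()
    if url_path.startswith("/article/"):
        return render_article(url_path[len("/article/"):])
    return "<h1>Not found</h1>"

print(handle("/"))
print(handle("/article/hello"))
```

Dropping a new `.md` file into the folder is enough to make it appear on the home page. The folder is the state; no database is involved.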

If you write your own dynamic site, the process will be similar.

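Four concerns come up in that process: deciding which code runs for which URL, representing the data you store, handling each request, and rendering the result as HTML. These four core concerns are so common in web apps that a whole class of heavy-duty libraries called web frameworks has emerged in nearly every language to handle them. You will often hear them referred to as routing, models, controllers, and views respectively.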

A framework that handles all of the concerns is usually called a "Model-View-Controller" (MVC) framework. There also exist "microframeworks" that only handle a subset of these concerns, perhaps only routing and controllers. For Python, the MVC framework is called Django (though its developers prefer to call it a "Model-Template-View" framework), and the microframework is called Flask. There are others, but these are the most famous ones.
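Before we reach for a real framework, it may help to see routing, models, controllers, and views side by side in plain Python. This is a toy sketch and not how Django or Flask work internally. Every name in it (`Product`, `show_product`, `ROUTES`, `dispatch`, the `/products/` path) is made up for illustration.

```python
from dataclasses import dataclass
from html import escape

# Model: the shape of the data and how to fetch it.
@dataclass
class Product:
    id: int
    name: str
    price: int

PRODUCTS = {1: Product(1, "Espresso", 90), 2: Product(2, "Cafe Latte", 120)}

def get_product(product_id):
    return PRODUCTS.get(product_id)

# View: turns data into HTML.
def product_view(product):
    return f"<h1>{escape(product.name)}</h1>\n<p>Price: {product.price}</p>"

# Controller: handles one kind of request by tying a model to a view.
def show_product(raw_id):
    if not raw_id.isdigit():
        return 404, "<h1>Not found</h1>"
    product = get_product(int(raw_id))
    if product is None:
        return 404, "<h1>Not found</h1>"
    return 200, product_view(product)

# Routing: decides which controller handles which URL path.
ROUTES = {"/products/": show_product}

def dispatch(url_path):
    for prefix, controller in ROUTES.items():
        if url_path.startswith(prefix):
            return controller(url_path[len(prefix):])
    return 404, "<h1>Not found</h1>"

print(dispatch("/products/1"))
print(dispatch("/about"))
```

A real framework does each of these jobs far more thoroughly, but the division of labor is the same: the route picks the controller, the controller asks the model for data, and the view turns that data into HTML.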

Web apps are difficult

The only reason anything gets done at all in web development nowadays is that some of the frameworks are pretty good at abstracting away the true complexity of the field. The frameworks aren't perfect abstractions, but believe me -- without them, you wouldn't even be able to start.

In the next article, I'll discuss Flask. My intent is to show you the basics of routing and controllers, which Flask is perfect for demonstrating.
