How To Write Semantic HTML

by John Elvis Durán Montoya, August 2nd, 2020

As a Microverse student, I have recently been struggling with some HTML and CSS projects. They constantly challenged me to learn new features of these technologies. I built my projects gradually, trying to follow good practices, but none of those practices struck me as much as the use of semantic HTML.

At first sight, this does not seem to be something to worry about, but if you look at the whole picture you will realize that it is a very important feature of HTML. That is why I want to show you how important it is to take semantics into account when it comes to writing good HTML.

One <div> to tag them all...

First of all, let's make something clear: semantics in HTML is nothing new. HTML was conceived as a semantic way to share content. Having said that, let's talk a little about the web's past. Some time ago, the internet and the web were nothing more than nerd things, a weird medium for sharing scientific and freaky content. But that changed over time. Many other segments of society realized the huge potential of this technology and started to use it.

This changed web technology forever. The web was no longer only a matter of content; it also required a presentation layer. Around that time, HTML 4.0 was released with a large set of tags for many semantic uses. But, as always, there were two apples of discord among them: <span> and <div>.

These two tags were conceived to label any content that could not be labelled by any other semantic tag. They were meant only for special cases. "Unfortunately", webmasters and graphic designers (the first web developers) did not see it that way. For them, this feature, combined with the then recently released CSS, was a golden opportunity to create stunning commercial web content.

They abused the <div> and <span> tags. Almost every document body on the web was a bunch of <div> tags. The web quickly turned into a place full of beautiful websites but with few semantic HTML documents.

<html>
<head>
  <title>Non-semantic HTML</title>
</head>
<body>
  <div id="container">
    <div id="header">
      <div id="header-content"></div>
    </div>
    <div id="section">
      Section content
    </div>
    <div id="footer"></div>
  </div>
</body>
</html>

Well, what is so wrong with that?

Maybe the problem is not evident at first glance. But when we create documents for the web, it is very important to separate content from presentation, and therefore it is also very important that the content is correctly structured. So, what does correctly structured mean? It means that each element of the content must be able to identify itself within the document, no matter the medium through which it is going to be presented.

For example, a header must be labelled by a tag indicating that it is a header, no matter whether it is going to be read, displayed or analyzed. Think about whether this is possible with a document full of generic tags like <div> or <span>. Let us imagine a screen reader for visually impaired people. What if, instead of using a semantically correct tag for a paragraph like <p>, we use a <div>? The screen reader has no way of knowing that the text is a paragraph, so it may not pause before and after reading it, and the experience for that visually impaired user will be poorer. On the other hand, what if we want our page to rank well on search engines?

The bad news is that search engines have a much harder time interpreting non-semantic content. And finally, as developers, is it not a good thing for us to find readable and understandable code? Imagine getting a big non-semantic HTML document and having to guess the meaning of each one of its elements. Well, these are the troubles with non-semantic HTML, but hey! I am forgetting a small issue, perhaps the most important one!

We are living in Web 3.0 nowadays! And, do you know what it is called? Yes! It is called The Semantic Web. So, do you still think semantic HTML code is not important?
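
To make the screen reader example above more concrete, here is a minimal sketch of the same content written both ways. The text and the class names are made up for illustration; the only point is that the second version tells a machine which part is a heading and which parts are paragraphs, while the first relies on class names that mean something only to the author.

```html
<!-- Non-semantic: the classes carry no meaning for a screen reader or a crawler -->
<div class="title">Weather report</div>
<div class="text">It will rain in the morning.</div>
<div class="text">The afternoon will be sunny.</div>

<!-- Semantic: the tags themselves say "heading" and "paragraph" -->
<h1>Weather report</h1>
<p>It will rain in the morning.</p>
<p>The afternoon will be sunny.</p>
```

Both versions can be styled to look identical with CSS, which is exactly why the difference is easy to overlook.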

...but then, how to write semantic HTML?

The answer is quite simple: follow the standards. HTML 4.0 is a great markup language, so much so that it was around for 17 years before HTML5 showed up. But if there is something that HTML5 does better than HTML 4.0, it is being more semantic. This does not mean that version 4.0 was not semantic, but rather that HTML5 filled most of the semantic gaps it had. HTML5 brought with it a new set of semantic tags, some of them specifically designed to avoid unnecessary use of <div> and <span>.

The World Wide Web Consortium recommends labeling every element according to its content. For example, if you are going to use <div class="header"> to group the top section of the document, why not use the semantic tag <header> instead? It is much more readable, accessible and understandable. So if you want to write more semantic HTML5, reach for the official documentation and check out the more than 100 available elements.
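
As a small sketch of that replacement, the two fragments below show the same top section before and after. The heading and tagline text are placeholders, not taken from any real page.

```html
<!-- Before: a generic container identified only by its class -->
<div class="header">
  <h1>My blog</h1>
  <p>Notes about web development</p>
</div>

<!-- After: the element itself states its role -->
<header>
  <h1>My blog</h1>
  <p>Notes about web development</p>
</header>
```

Any CSS rule that targeted `.header` would need to target `header` instead, but nothing else has to change.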

Meanwhile, I will describe some of the semantic elements I learnt in my HTML course with Microverse.

Don't divide, better create sections.

In HTML 4.0 it was very common to create a lot of <div> elements according to our graphic layout, not to our content. The result was a confusing bunch of <div> tags with no semantic relevance. Since HTML5 we have the <section> tag, which is very useful for splitting our main content into smaller groups of content.

<section> elements can be nested, and it is important to point out that each one should have a heading element that identifies it.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <section id="container">
    <section id="header">
      <section id="header-content"></section>
    </section>
    <section id="section">
      Section content
    </section>
    <section id="footer"></section>
  </section>
</body>
</html>
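
The example above is only a direct replacement of <div> with <section>, so its sections still have no headings. As a hedged sketch of what nested sections with headings might look like, consider the fragment below; the recipe topic and the heading levels are assumptions chosen only for illustration.

```html
<section>
  <h2>Recipes</h2>
  <!-- Each nested section gets its own heading -->
  <section>
    <h3>Soups</h3>
    <p>Soup recipes go here.</p>
  </section>
  <section>
    <h3>Desserts</h3>
    <p>Dessert recipes go here.</p>
  </section>
</section>
```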

Main content?

Yes, the main content. What is most important when it comes to documents? The navbar? A slider? Social icons? Forms? No! The most important thing in a document is the information itself, which is what we are trying to preserve and transmit. When we refer to "main content" we are talking about that information. Everything else is superfluous and can be removed or edited at will without affecting the information's integrity. That is why identifying the "main content" is the key criterion for building a more semantic HTML document.

Self-contained

If you find a section in your document that works perfectly as a complete piece of information and can be replaced without affecting the integrity of the document, you have got an <article>. Blog posts and news items are excellent <article> examples. You can add and swap them at will without affecting the document structure.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <section>
    <section id="header">
      <section id="header-content"></section>
    </section>
    <article>
      <h1>Article heading</h1>
      <p>article content</p>
    </article>
    <section id="footer"></section>
  </section>
</body>
</html>

<section> and <article> are workmates, not relatives.

When you first try to use semantic HTML, it is common to struggle with the decision about how to nest <section> and <article>. Who is the parent? Who is the child? Short answer: neither. These elements were not conceived to be part of a fixed hierarchy; in fact, they are made for working together.

When it comes to creating semantic structures, you can use one inside the other without any problem.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <section>
    <section id="header">
      <section id="header-content"></section>
    </section>
    <article>
      <h1>Article heading</h1>
      <section>
        <h2>Content heading</h2>
        <p>article content</p>
      </section>
    </article>
    <section id="footer"></section>
  </section>
</body>
</html>
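
The example above nests a <section> inside an <article>. The opposite nesting is just as reasonable; a minimal sketch, with made-up headings, could be a section that groups several articles:

```html
<section>
  <h1>Latest news</h1>
  <!-- Each article is a self-contained story inside the section -->
  <article>
    <h2>First story</h2>
    <p>story content</p>
  </article>
  <article>
    <h2>Second story</h2>
    <p>story content</p>
  </article>
</section>
```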

Formally introducing content

It is normal to have sections that introduce the content that follows, maybe through a set of headings or an image. For these cases, we can use <header> to group all the elements that serve that purpose. You may have several headers in your document, on one condition: never place a <header> inside another <header>, a <footer> or an <address> element.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <header>
    <section id="header-content"></section>
  </header>
  <section>
    <article>
      <h1>Article heading</h1>
      <section>
        <h2>Content heading</h2>
        <p>article content</p>
      </section>
    </article>
    <section id="footer"></section>
  </section>
</body>
</html>
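
Since a document may have several headers, here is a minimal sketch with two of them: one for the page and one for a single article. The site name and the publication date are placeholders used only for illustration.

```html
<body>
  <!-- Header of the whole page -->
  <header>
    <h1>Site name</h1>
  </header>
  <article>
    <!-- Header of this article only -->
    <header>
      <h2>Article heading</h2>
      <p>Published on <time datetime="2020-08-02">August 2nd, 2020</time></p>
    </header>
    <p>article content</p>
  </article>
</body>
```

Neither header sits inside another <header>, <footer> or <address>, so the condition mentioned above is respected.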

Formally closing content

If there is a way of opening, there should be a way of closing! Yes! You are right! If you want to tag relevant elements at the end of a section, you can use <footer>. Credits, copyright notices, sitemaps, secondary navbars and so on can all be grouped inside a <footer> tag.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <header>
    <section id="header-content"></section>
  </header>
  <section>
    <article>
      <h1>Article heading</h1>
      <section>
        <h2>Content heading</h2>
        <p>article content</p>
      </section>
    </article>
  </section>
  <footer>
      
  </footer>
</body>
</html>
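
The <footer> in the example above is left empty. As a sketch of what it might hold, assuming a made-up site name and placeholder links, a filled-in version could look like this:

```html
<footer>
  <!-- Copyright and credits -->
  <p>&copy; 2020 Example Site. Photos by Jane Doe.</p>
  <!-- A short list of secondary links -->
  <ul>
    <li><a href="#">Sitemap</a></li>
    <li><a href="#">Contact</a></li>
  </ul>
</footer>
```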

What about the controls?

It is well known that one of the advantages of HTML is that it allows navigation among documents. That is why we are always grouping links into navbars along with other elements like icons and images. There is also a semantic tag for this purpose, the <nav> element. No matter what technique you use for your navbars, put it all inside the <nav> element.

This element can be placed inside any other block element in the HTML document, but please use it wisely.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <header>
    <nav>
      <ul>
        <li><a href="#">Link 1</a></li>
        <li><a href="#">Link 2</a></li>
        <li><a href="#">Link 3</a></li>
      </ul>
    </nav>      
  </header>
  <section>
    <article>
      <h1>Article heading</h1>
      <section>
        <h2>Content heading</h2>
        <p>article content</p>
      </section>
    </article>
  </section>
  <footer>

  </footer>
</body>
</html>

There can be only one.

Even after identifying the main content, we may still need to label a section of the document as the main section. This is easily solved with the <main> element, which labels the content that is unique to the document. But as the title says, there can be only one: one document, one <main> element. This semantic element can be especially useful for search engine optimization.

When web bots get to your page, the <main> element will be yelling READ ME! So don't overlook it.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <header>
    <nav>
      <ul>
        <li><a href="#">Link 1</a></li>
        <li><a href="#">Link 2</a></li>
        <li><a href="#">Link 3</a></li>
      </ul>
    </nav>      
  </header>
  <main>
    <article>
      <h1>Article heading</h1>
      <section>
        <h2>Content heading</h2>
        <p>article content</p>
      </section>
    </article>
  </main>
  <section>
    <h1>Section heading</h1>
    <p>Section content</p>
  </section>
  <footer>

  </footer>
</body>
</html>

If it is not main, put it aside!

We have talked about some very useful semantic elements. Elements like <section>, <article>, <main>, <header>, <footer> and <nav> are great tools for writing semantic HTML code. But what about content that is related to the main content without being part of it? Well, we have a special tag for those cases: <aside>.

Use it when you need to label extra content, for example, newsfeeds, commercial offers, a newsletter form, etc.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <header>
    <nav>
      <ul>
        <li><a href="#">Link 1</a></li>
        <li><a href="#">Link 2</a></li>
        <li><a href="#">Link 3</a></li>
      </ul>
    </nav>      
  </header>
  <main>
    <article>
      <h1>Article heading</h1>
      <section>
        <h2>Content heading</h2>
        <p>article content</p>
      </section>
    </article>
  </main>
  <section>
    <h1>Section heading</h1>
    <p>Section content</p>
  </section>
  <aside>
    Auxiliary content
  </aside>
  <footer>

  </footer>
</body>
</html>
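
As one possible illustration of the newsletter case mentioned above, the <aside> could contain a small sign-up form. The field name, the `id` and the `action="#"` target are assumptions; a real form would point to whatever endpoint your site uses.

```html
<aside>
  <h2>Newsletter</h2>
  <form action="#" method="post">
    <!-- The label is tied to the input through for/id -->
    <label for="email">Email address</label>
    <input type="email" id="email" name="email" required>
    <button type="submit">Subscribe</button>
  </form>
</aside>
```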

Just figure it!

And last but not least, we have a useful element for labeling content that is in the main flow but can be dismissed when necessary. It is somewhat similar to <article>, because it is a self-contained element that can be placed or removed at will. We are talking about <figure>. This tag is especially useful when it comes to grouping auxiliary content.

Along with its sidekick <figcaption>, <figure> is excellent for marking content like illustrations, charts, diagrams, photos, etc.

<html>
<head>
  <title>Document</title>
</head>
<body>
  <header>
    <nav>
      <ul>
        <li><a href="#">Link 1</a></li>
        <li><a href="#">Link 2</a></li>
        <li><a href="#">Link 3</a></li>
      </ul>
    </nav>      
  </header>
  <main>
    <article>
      <h1>Article heading</h1>
      <section>
        <h2>Content heading</h2>
        <p>article content</p>
      </section>
    </article>
  </main>
  <section>
    <h1>Section heading</h1>
    <p>Section content</p>
  </section>
  <aside>
    Auxiliary content
  </aside>
  <footer>
    <figure>
      <img src="logo.png" alt="Site logo">
      <figcaption>Slogan</figcaption>
    </figure>
  </footer>
</body>
</html>
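
The example above uses <figure> for a logo in the footer. For the charts and diagrams mentioned earlier, a sketch of a figure inside an article might look like the fragment below; the file name `chart.png`, the alternative text and the caption are invented for illustration.

```html
<article>
  <h1>Article heading</h1>
  <p>article content that refers to the chart</p>
  <figure>
    <!-- alt describes the image; figcaption is the visible caption -->
    <img src="chart.png" alt="Bar chart of monthly visits, rising from January to June">
    <figcaption>Figure 1: Monthly visits in the first half of the year.</figcaption>
  </figure>
</article>
```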

Finally...

One last tip: remember that the same problem we had with <div> may occur with any other element, so don't overuse tags! Even if you use only semantic HTML elements, abusing them will not give you a well-structured, semantic document.

In conclusion, it is important to be aware of how much semantics matters for web development. The success of tomorrow's web depends to some extent on how accessible, readable and analyzable information is. We have to stop thinking of HTML as a language for making web pages. Instead, let us use HTML as a powerful tool for distributing semantic content.