It’s no secret that Java is one of the most popular programming languages in the world right now. It was officially launched on May 23, 1995.
This article covers the basics and fundamental tools of Java. It should come in handy for beginner Java developers, while senior developers can use it to brush up on their knowledge.
The Java Development Kit (JDK) is the official development kit for building applications in the Java programming language. The kit includes the Java development tools and the Java Runtime Environment (JRE).
The Java development tools include around 40 different tools, such as javac (the compiler), java (the launcher for Java applications), javap (the class file disassembler), jdb (the Java debugger), and others.
The Java Runtime Environment is a set of software tools needed to launch a compiled Java program. It includes the Java Virtual Machine (JVM) and the Java Class Library.
The JVM is a program responsible for executing bytecode. Its first advantage is the so-called ‘write once, run anywhere’ rule: an app developed in Java will run on any platform that has a JVM. This is an advantage of both the JVM and Java itself.
There are many JVM implementations, both commercial and open source. One of the reasons for creating new JVMs is to increase performance on a particular platform: each JVM is developed for a platform separately, so it can be tuned to work faster there.
The most widespread JVM implementation is HotSpot, which is part of OpenJDK. Other implementations include IBM J9 and Excelsior JET.
According to the Java SE specifications, you need to complete the following 3 stages to get code running on the JVM:
Classloaders are special classes that are a part of the JVM.
They load classes into memory and make them available for execution. Classloaders work with all classes: those that we wrote ourselves and those that Java itself needs.
Imagine that you have developed an app that, in addition to the standard classes, contains lots of custom classes. How will the JVM cope? Java uses lazy loading, which means that a class is loaded only when it is first referenced.
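Lazy behaviour can be observed with a small sketch. It relies on a side effect rather than on loading itself: a static initializer runs when the JVM initializes a class, and that step is also postponed until the class is first actively used. The names LazyLoadingDemo and Heavy are invented for this illustration.

```java
// Illustrative sketch; the class names are assumptions, not part of any library.
class LazyLoadingDemo {
    // Collects events in the order in which they happen.
    static final StringBuilder LOG = new StringBuilder();

    static class Heavy {
        // Runs once, when the JVM initializes Heavy on its first active use.
        static {
            LOG.append("Heavy initialized;");
        }

        static void touch() {
            LOG.append("Heavy used;");
        }
    }

    static String run() {
        LOG.setLength(0);
        LOG.append("start;");
        Heavy.touch(); // the first reference to Heavy happens here
        LOG.append("end;");
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call, run() returns start;Heavy initialized;Heavy used;end; which shows that nothing happened to Heavy before it was actually needed.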
The first classloader is the Bootstrap classloader. It is written in native code (C++ in HotSpot).
Bootstrap is the basic classloader that loads all system classes from the rt.jar archive (in Java 8 and earlier; since Java 9 the system classes are packaged as modules instead). There is a slight difference between loading classes from rt.jar and loading our own classes. When the JVM loads classes from rt.jar, it skips some of the verification stages, because it already trusts those classes. This is why you should not place any of your own files into that archive.
The next classloader is the Extension classloader. It loads Java extension classes from the jre/lib/ext folder (this mechanism was removed in Java 9, where the Platform classloader takes its place). Let’s say you want a certain class to be available every time the JVM starts. To do that, you can copy a JAR file containing the compiled class to that folder, and the class will be picked up automatically.
Another classloader is the System classloader. It loads classes from the classpath that we specified when launching the application.
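The chain of classloaders can be inspected at run time. The sketch below walks from a loader up through its parents; the class name ClassLoaderChain and the label "bootstrap" are made up for the example. The Bootstrap classloader has no Java object of its own, so the API represents it as null.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; ClassLoaderChain is a hypothetical name.
class ClassLoaderChain {
    // Walks from the given loader up through its parents.
    // A null loader stands for the Bootstrap classloader.
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap");
        return names;
    }

    public static void main(String[] args) {
        // Core classes such as String come from the Bootstrap classloader.
        System.out.println("String: " + chain(String.class.getClassLoader()));
        // Application classes normally come from the System classloader.
        System.out.println("System: " + chain(ClassLoader.getSystemClassLoader()));
    }
}
```

The concrete loader class names in the output differ between Java versions, so they are best treated as informational only.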
The process of classloading is performed by the following hierarchy:
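A loading request travels up the hierarchy first: the System classloader asks the Extension classloader, which in turn asks Bootstrap. If the class is not found in the Bootstrap cache, Bootstrap tries to load it. If that fails, the Extension classloader tries to load the class itself. If it succeeds, the class stays in the cache of the Extension classloader, and the class loading is complete. Otherwise the System classloader makes the last attempt, and if it fails as well, a ClassNotFoundException is thrown.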
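The delegation order can be made visible with a custom classloader that cannot load anything and only records what it was asked to find. This is a sketch under that assumption; DelegationDemo and RecordingLoader are hypothetical names. It uses the standard behaviour of ClassLoader.loadClass(), which calls findClass() only after the parent chain has failed.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; the class names are assumptions.
class DelegationDemo {
    // A loader that never loads anything itself; it only records requests.
    static class RecordingLoader extends ClassLoader {
        final List<String> asked = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        // Reached only when no parent could supply the class.
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            asked.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    public static void main(String[] args) throws Exception {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());

        // A parent resolves String, so our loader is never asked to find it.
        Class<?> string = loader.loadClass("java.lang.String");
        System.out.println("Resolved by a parent: " + (string == String.class));
        System.out.println("Asked so far: " + loader.asked);

        try {
            loader.loadClass("no.such.Missing");
        } catch (ClassNotFoundException e) {
            // All parents failed first; only then was our loader consulted.
            System.out.println("Asked after the parents failed: " + loader.asked);
        }
    }
}
```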
Here we proceed directly to the structure of class files.
One class written in Java is compiled into one file with the .class extension. If a Java source file contains several classes, it is compiled into several .class files, one bytecode file per class.
All numbers, strings, and references to classes, fields, and methods are stored in the constant pool, which the JVM keeps in the Metaspace memory area. The class description is stored in the same area and contains the name, modifiers, superclass, superinterfaces, fields, methods, and attributes. Attributes, in turn, can contain any additional information.
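The beginning of a class file can be read directly to see part of this structure. The sketch below reads the header of java.lang.Object: the magic number, the format version, and the constant pool count. The class name ClassFileHeader is invented, and the code assumes the running JVM exposes the bytes of Object.class as a resource, which may not hold on every runtime.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative sketch; ClassFileHeader is a hypothetical name.
class ClassFileHeader {
    // Returns {magic, minorVersion, majorVersion, constantPoolCount}.
    static long[] read() throws IOException {
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            if (in == null) {
                throw new IOException("class file bytes are not available on this JVM");
            }
            DataInputStream data = new DataInputStream(in);
            long magic = data.readInt() & 0xFFFFFFFFL; // every class file starts with 0xCAFEBABE
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort();
            int constantPoolCount = data.readUnsignedShort();
            return new long[] {magic, minor, major, constantPoolCount};
        }
    }

    public static void main(String[] args) throws IOException {
        long[] h = read();
        System.out.printf("magic=0x%X version=%d.%d constant_pool_count=%d%n",
                h[0], h[2], h[1], h[3]);
    }
}
```

The version and constant pool count depend on the JDK in use; only the magic number is the same everywhere.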
Thus, when loading classes the following steps are performed:
First of all, the JVM can interpret bytecode in order to execute it. Bytecode interpretation is a rather slow process: the interpreter goes through the bytecode instruction by instruction and carries out each one in turn.
The JVM can also translate (or compile) bytecode into machine code that is executed directly on the computer’s processor.
Code that is executed frequently is not interpreted every time; the JVM compiles it to machine code automatically.
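A frequently executed method is easy to produce on purpose. The sketch below calls a small method many times; HotLoop and sumOfSquares are invented names, and the call count is an arbitrary assumption rather than a documented threshold. On HotSpot, running it with the -XX:+PrintCompilation flag lists the methods that get compiled, and the -Xint flag forces pure interpretation for comparison.

```java
// Illustrative sketch; the names and the number of calls are assumptions.
class HotLoop {
    // A small method that becomes "hot" when it is called many times.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long result = 0;
        for (int call = 0; call < 20_000; call++) {
            result = sumOfSquares(1_000);
        }
        System.out.println(result); // prints 333833500
    }
}
```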
A compiler is a program that translates source code from a high-level programming language into machine code that a computer can understand.
Compilers are divided into:
Compilers can also be divided by the moment of compilation:
The Stack is a temporary memory space in Java. Stack memory is always accessed in LIFO order: last in, first out.
The Stack stores method calls. Variables on the stack live until the method they were created in finishes executing.
When a new method is invoked, a new frame is created on top of the Stack. As soon as the method ends, that frame is erased, and the next method invoked will use the freed space. If the stack memory is full, Java throws a java.lang.StackOverflowError. For example, this can happen when a recursive function keeps invoking itself until the Stack overflows.
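The overflow described above can be reproduced with a recursion that has no exit condition. This is a minimal sketch with invented names (StackDepth, recurse, measure); the number of frames it reports depends on the JVM and its stack size settings.

```java
// Illustrative sketch; the class and method names are assumptions.
class StackDepth {
    static int depth;

    // Every call adds one more frame to the stack; there is no base case on purpose.
    static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many calls were made before the stack ran out.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Frames are popped while the error propagates, so the thread can continue.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames before overflow: " + measure());
    }
}
```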
Key Stack features:
Another memory space in Java is the Heap. It is used to store objects and classes. New objects are always created in the Heap, while references to these objects are stored in the Stack. Objects stored in the Heap are globally accessible, which means that they can be reached from anywhere in the application.
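The difference between values on the Stack and objects in the Heap shows up when a method changes its arguments. In the sketch below, which uses the invented names StackVsHeap and Counter, the primitive is copied into the new frame, while the reference still points to the one shared object in the Heap.

```java
// Illustrative sketch; the class names are assumptions.
class StackVsHeap {
    static class Counter {
        int value;
    }

    // 'n' is a copy of a primitive that lives in this method's own frame;
    // 'c' is a copy of a reference to the same heap object the caller holds.
    static void change(int n, Counter c) {
        n = 100;
        c.value = 100;
    }

    static int[] run() {
        int n = 1;
        Counter c = new Counter();
        c.value = 1;
        change(n, c);
        return new int[] {n, c.value};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("primitive: " + r[0] + ", object field: " + r[1]);
    }
}
```

The program prints primitive: 1, object field: 100, because only the change made through the reference reaches the object in the Heap.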
The Heap is divided into several parts called generations:
Why was the decision taken to get rid of the Permanent generation? First of all, because of the memory overflow error associated with it. Since PermGen had a fixed maximum size and could not expand dynamically, sooner or later the memory ran out, an error was thrown, and the application crashed.
In contrast, Metaspace has a dynamic size: at run time it can grow, by default up to the amount of native memory available to the JVM.
Key Heap features:
Based on the information above, let’s take a look at the process of memory management with a simple example:
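    public class App {
        public static void main(String[] args) {
            int id = 23;
            String pName = "Jon";
            Person p = null;
            p = new Person(id, pName);
        }
    }

    class Person {
        int pid;
        String name;

        // Parameterized constructor; it passes the name on to the setter
        Person(int pid, String personName) {
            this.pid = pid;
            setPersonName(personName);
        }

        void setPersonName(String personName) {
            this.name = personName;
        }
    }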
Now, let’s analyze this step by step:
1. Upon entering the main() method, space is created in stack memory to store the primitives and references of this method: the primitive value of id is stored directly on the stack, while the reference variables pName and p are also placed on the stack and point to objects in the heap (p stays null until the Person object is created).
2. The call to the parameterized constructor Person(int, String) from main() allocates further memory on top of the previous stack frame. This frame stores the primitive value id in the stack memory and the reference personName, which points to the actual string from the string pool in the heap memory.
3. This constructor in turn calls the setPersonName() method, for which a further allocation takes place in stack memory on top of the previous one. It again stores variables in the manner described above.
This allocation is explained in the following diagram:
The garbage collector is a program that works inside the JVM and deletes objects that are no longer used or needed.
Different JVMs can use different garbage collection algorithms, so there is a variety of garbage collectors in Java.
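Which collectors a running JVM actually uses can be checked through the standard java.lang.management API. The sketch below only lists their names; ActiveCollectors is an invented class name. The result depends on the JVM and on the collector selected at startup, for example with the HotSpot flag -XX:+UseSerialGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; ActiveCollectors is a hypothetical name.
class ActiveCollectors {
    // Collects the names of the garbage collectors registered in this JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The output varies between JVMs and collector settings.
        System.out.println(names());
    }
}
```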
Right now, we will talk about the simplest garbage collector: Serial GC. To request a garbage collection, we call System.gc(); note that this is only a request, and the JVM may ignore it.
As already mentioned, the Heap memory is divided into 2 sections called generations: the New generation and the Old generation.
New generation includes 3 regions: Eden, Survivor 0, and Survivor 1.
Old generation includes the Tenured region.
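The regions of a running JVM can be listed through the same management API. This sketch prints the memory pools that the JVM reports as heap memory; HeapRegions is an invented name. The pool names are chosen by the JVM and differ between collectors, so they will not necessarily match the region names used in this article word for word.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; HeapRegions is a hypothetical name.
class HeapRegions {
    // Returns the names of all memory pools that belong to the heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(heapPoolNames());
    }
}
```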
So what happens when we create a new object in Java?
First of all, the object goes to Eden. If we have created a lot of objects and there is no more memory for new ones, the garbage collector is triggered and frees up memory: it relocates all surviving objects from Eden to the Survivor 0 region and discards the rest. Thus, Eden is completely cleaned.
If Eden fills up with new objects again, the garbage collector works with both the Eden region and the Survivor 0 region. After the collection, the surviving objects go to the other region, Survivor 1, while the two other regions are left empty.
During the following garbage collection, the Survivor 0 region will be chosen as the target region again. This is why it’s important that one of the regions, Survivor 0 or Survivor 1, is always empty.
The JVM keeps track of the objects that are repeatedly copied from one region to another. To optimize this mechanism, once an object has survived a certain number of collections, the garbage collector moves it to the Tenured region.
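The copying scheme described above can be imitated with a toy model. This is not how a real collector is implemented: the regions are plain lists, reachability is a flag set by hand, and the threshold of 2 is an arbitrary assumption. All names (ToyGenerations, Obj, minorCollection) are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; a real JVM works on raw memory, not on lists.
class ToyGenerations {
    static final int TENURING_THRESHOLD = 2; // assumed value for the example

    static class Obj {
        final String name;
        boolean reachable = true;
        int age; // number of collections survived so far

        Obj(String name) {
            this.name = name;
        }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor region that currently holds objects
    List<Obj> to = new ArrayList<>();   // survivor region that is kept empty
    final List<Obj> tenured = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Copies live objects out of Eden and the occupied survivor region,
    // then swaps the roles of the two survivor regions.
    void minorCollection() {
        copySurvivors(eden);
        copySurvivors(from);
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> region) {
        for (Obj o : region) {
            if (!o.reachable) {
                continue; // garbage is simply not copied
            }
            o.age++;
            if (o.age > TENURING_THRESHOLD) {
                tenured.add(o);
            } else {
                to.add(o);
            }
        }
        region.clear();
    }

    static ToyGenerations demo() {
        ToyGenerations heap = new ToyGenerations();
        heap.allocate("a");
        heap.allocate("b").reachable = false; // b becomes garbage
        heap.minorCollection();               // a survives, b is dropped
        heap.allocate("c");
        heap.minorCollection();               // a and c move to the other survivor region
        heap.minorCollection();               // a exceeds the threshold and is tenured
        return heap;
    }

    public static void main(String[] args) {
        ToyGenerations heap = demo();
        System.out.println("tenured: " + heap.tenured.get(0).name
                + ", survivor: " + heap.from.get(0).name);
    }
}
```

After the three collections in demo(), object a sits in the tenured list, object c is in the occupied survivor list, and both Eden and the other survivor list are empty.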
When the Tenured region overflows, a Mark-Sweep-Compact, or full, garbage collection happens.
During this collection, the Tenured region is cleaned of unused objects and defragmented, i.e. the remaining objects are packed together sequentially.
In this article, I covered the basic tools of Java (the JVM, JRE, and JDK), the main principles and stages of code execution on the JVM, compilation, memory management, and the garbage collection mechanism.
* The article is based on the report of Eugene Fraiman, IntexSoft Java Developer.