Multithreading is something you thought you knew, but it usually turns out you understand less of it than you expect.
In short, multithreading (concurrency) means running several pieces of code at the same time while they share a common resource. Since accessing the shared data in different orders can yield different results, a multithreaded program is not guaranteed to produce the same result on every run.
Note the difference between concurrency and parallelism. Concurrency can happen even on a single core, where multithreading merely creates an “illusion” of running at the same time by interleaving the threads, whereas parallelism literally runs threads on different cores at the same time. Either way, whenever threads share data without coordination, race conditions can make the program output the wrong result. Below is an example:
Consider the following two programs in Java:
1  balance = 16000;
2  // Program 1:
3  public void main() {
4      balance = balance + 1000;
5      balance = balance - 4000;
6  }
7  // Program 2:
8  public void main() {
9      balance = balance - 2000;
10     balance = balance + 6000;
11 }
Assuming both programs refer to the same balance variable of type int, if we run the first program and then the second, we get the same result in every execution. However, when we run them concurrently, things get tricky: the first program might compute its increase of 1000 based on the original balance of 16000, while the second program computes its decrease of 2000 based on that same original value of 16000. In that case, whichever write happens last overwrites the other, so effectively only one of line 4 or line 9 takes effect. We lost one line of code!
A critical section is the part of the code that accesses a resource shared by multiple threads. In this example, the code that reads and updates balance is the critical section, and balance is the shared resource.
In general, there are several methods to avoid race conditions.
- Using a lock to guard the critical section every time a thread accesses it. Locking can be coarse-grained (locking the entire list even when reading/writing only a few elements in it) or fine-grained (locking only the elements that are being accessed).
- Optimistic approach: check whether the shared data was altered by another thread during the execution; if it was, roll back the operation and retry.
- Using atomic operations along the way: atomic means that an operation either happens completely or not at all, so no other thread can observe or interfere with it halfway through. Atomic operations can be used to build lock-free programs that still handle race conditions well (a minimal sketch follows this list).
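To make the last two bullets concrete, here is a minimal sketch of my own (the class name OptimisticBalance is made up) that applies the optimistic, atomic style to the balance example, using AtomicInteger and a compare-and-set retry loop:

import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticBalance {
    // shared balance stored in an atomic variable instead of a plain int
    private static final AtomicInteger balance = new AtomicInteger(16000);

    // Optimistic update: read the current value, compute the new value, and only
    // write it back if nobody changed the balance in between; otherwise retry.
    static void add(int amount) {
        while (true) {
            int current = balance.get();
            int updated = current + amount;
            if (balance.compareAndSet(current, updated)) {
                return;   // our update went through atomically
            }
            // another thread modified balance first, so loop and try again
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> { add(1000); add(-4000); });
        Thread t2 = new Thread(() -> { add(-2000); add(6000); });
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(balance.get());   // always prints 17000: no lost updates
    }
}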
Since there are multiple threads in a program, there need to be rules about which thread goes first, because we do not want to starve any thread. The component that enforces these rules is called a scheduler; it determines the order of execution based on several factors: how long a thread has been waiting, how long a thread needs to execute, the priority or id of a thread, and so on. There are different kinds of scheduling policies, such as Round Robin, Shortest Remaining Time, and First-Come-First-Serve.
In Java, the basic building blocks of multithreading are the Thread class and the Runnable interface in java.lang, together with the java.util.concurrent package, which provides many useful tools such as different types of locks, executors, and atomic variables; methods like sleep(), start(), and join() belong to Thread itself. Here, I'm just going to show several of these multithreading utilities.
The most basic utility is the Thread class:
// Any class can extend Thread to become a thread whose executable code is inside run()
public class Thread {
    // The work of the thread should be written in this method
    public void run();
    // Makes the calling thread wait until this thread has finished
    public void join() throws InterruptedException;
    // Starts the thread, which causes run() to be invoked in a new thread of execution
    public void start();
    // Static method: pauses the current thread for (at least) the given number of milliseconds
    public static void sleep(long milliseconds) throws InterruptedException;
}
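For instance, a minimal usage might look like this (HelloThread is a made-up name for illustration):

// A thread that just prints a message when it runs
public class HelloThread extends Thread {
    @Override
    public void run() {
        System.out.println("Hello from " + getName());
    }

    public static void main(String[] args) throws InterruptedException {
        HelloThread t = new HelloThread();
        t.start();   // run() now executes in a new thread
        t.join();    // wait here until t has finished
    }
}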
There are plenty of demos online that you can reference, so it is recommended to actually try an example rather than only reading about the concept.
There is another way of creating a thread: the Runnable interface.
// Any class can implement Runnable and fill in run(); the Runnable is then handed to a Thread or an executor
public interface Runnable {
    public void run();
}
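Since Runnable has only one method, a lambda works as well; a quick sketch:

Runnable task = () -> System.out.println("Hello from a Runnable");
Thread t = new Thread(task);   // the Runnable is handed to a Thread
t.start();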
More importantly, there are several types of locks that you can use. Note that the most basic way of locking things up is the keyword synchronized. There are several ways of using it:
// When invoking this method, the calling thread acquires the intrinsic lock (monitor) of the object the method
// belongs to, so no other thread can enter a synchronized method or block of the same object at the same time
public synchronized void someMethod(int someArgument);

// When executing the block, the thread holds the monitor of the object inside the parentheses
synchronized (someObject) {
    // some code...
}

// When executing the block, the thread holds the monitor of the object itself. This is equivalent to the first form.
synchronized (this) {
    // some code...
}
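For example, a counter guarded by synchronized (a sketch of my own, SynchronizedCounter is not a library class) could look like this:

public class SynchronizedCounter {
    private int count = 0;

    // only one thread at a time can hold this object's monitor,
    // so the read-modify-write on count cannot interleave
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }
}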
Now, we introduce the explicit locks. The most basic and simplest one is ReentrantLock: when calling lock.lock(), if another thread already holds the lock, the calling thread blocks until that thread releases it. It would be used something like this:
// requires java.util.concurrent.locks.ReentrantLock
ReentrantLock lock = new ReentrantLock();
int count = 0;

void increment() {
    lock.lock();           // block until we hold the lock
    try {
        count++;           // the critical section
    } finally {
        lock.unlock();     // always release, even if an exception was thrown
    }
}
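ReentrantLock also offers a non-blocking variant, tryLock(), which returns false immediately if another thread holds the lock; a small sketch reusing the lock and count from above:

void tryIncrement() {
    if (lock.tryLock()) {        // returns false right away instead of waiting
        try {
            count++;
        } finally {
            lock.unlock();
        }
    } else {
        // could not get the lock; do something else instead of blocking
    }
}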
There is another type of lock called ReadWriteLock. The idea is that it is always safe to let several threads read at the same time, so readers only need a shared read lock that does not exclude other readers; only writing requires the exclusive write lock. An example would look like this (sleep() and stop() are small helper methods here, assumed to pause for a moment and to shut the executor down):
// requires java.util.HashMap, java.util.concurrent.* and java.util.concurrent.locks.*
ExecutorService executor = Executors.newFixedThreadPool(2);
Map<String, String> map = new HashMap<>();
ReadWriteLock lock = new ReentrantReadWriteLock();

// writer task: takes the exclusive write lock before touching the map
executor.submit(() -> {
    lock.writeLock().lock();
    try {
        sleep(1);
        map.put("foo", "bar");
    } finally {
        lock.writeLock().unlock();
    }
});

// reader task: several of these can hold the read lock at the same time
Runnable readTask = () -> {
    lock.readLock().lock();
    try {
        System.out.println(map.get("foo"));
        sleep(1);
    } finally {
        lock.readLock().unlock();
    }
};

executor.submit(readTask);
executor.submit(readTask);

stop(executor);
As you can see, reading only involves the read lock, which does not block read requests from other threads, whereas writing must take the write lock, which excludes everyone else. This can save a lot of locking overhead when reads dominate, but it is easier to get wrong because you have to be careful with which lock you use: writing under a read lock is still a race condition, and a thread holding the read lock cannot simply upgrade to the write lock.
Now, we introduce another, trickier lock: StampedLock. Every lock acquisition returns a value of type long, the stamp, and later we pass that stamp back to release the lock. Note that unlike ReentrantLock, a StampedLock is not reentrant: each call to a lock method generates a new stamp, and the matching unlock must receive exactly that stamp. Thus, we have to be careful when we acquire locks. An example usage is shown below:
// StampedLock also lives in java.util.concurrent.locks
ExecutorService executor = Executors.newFixedThreadPool(2);
Map<String, String> map = new HashMap<>();
StampedLock lock = new StampedLock();

// writer task: writeLock() blocks until the lock is free and returns a stamp
executor.submit(() -> {
    long stamp = lock.writeLock();
    try {
        sleep(1);
        map.put("foo", "bar");
    } finally {
        lock.unlockWrite(stamp);   // must pass back the same stamp
    }
});

// reader task: readLock() also returns a stamp; many readers can hold it at once
Runnable readTask = () -> {
    long stamp = lock.readLock();
    try {
        System.out.println(map.get("foo"));
        sleep(1);
    } finally {
        lock.unlockRead(stamp);
    }
};

executor.submit(readTask);
executor.submit(readTask);

stop(executor);
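StampedLock also supports the optimistic approach mentioned earlier through tryOptimisticRead(): read without holding any lock, then validate the stamp afterwards and fall back to a real read lock if a writer interfered. A rough sketch, reusing the map and lock from the example above (it would be submitted to the executor just like readTask):

Runnable optimisticRead = () -> {
    long stamp = lock.tryOptimisticRead();   // does not block, just hands back a stamp
    String value = map.get("foo");           // read without holding any lock
    if (!lock.validate(stamp)) {
        // a write happened since tryOptimisticRead(), so retry under a real read lock
        stamp = lock.readLock();
        try {
            value = map.get("foo");
        } finally {
            lock.unlockRead(stamp);
        }
    }
    System.out.println(value);
};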
There is another concept called the semaphore. A semaphore is both a concept and a utility in the java.util.concurrent package. The concept first: a semaphore is simply a control signal shared among threads; it can be a non-negative counter indicating, for instance, whether a thread should wait or may proceed. In Java, the Semaphore utility is slightly different in its details, but the idea is the same.
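For instance, java.util.concurrent.Semaphore hands out a fixed number of permits: a thread calls acquire() before entering and release() afterwards. Here is a minimal sketch of my own (the class name SemaphoreDemo is made up), assuming we want at most two threads inside the guarded section at a time:

import java.util.concurrent.Semaphore;

public class SemaphoreDemo {
    // two permits: at most two threads may be inside the guarded section at once
    private static final Semaphore semaphore = new Semaphore(2);

    public static void main(String[] args) {
        Runnable task = () -> {
            try {
                semaphore.acquire();              // blocks until a permit is available
                try {
                    System.out.println(Thread.currentThread().getName() + " is inside");
                    Thread.sleep(100);            // pretend to do some work
                } finally {
                    semaphore.release();          // give the permit back
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        for (int i = 0; i < 5; i++) {
            new Thread(task).start();
        }
    }
}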
There are many problems that need to be solved in the realm of concurrency, such as making data structures like red-black trees safe for concurrent use. Many Java classes, such as ConcurrentLinkedQueue, ConcurrentLinkedDeque, and ConcurrentHashMap, are implemented with concurrent access in mind. There are also atomic utilities, such as AtomicInteger, which is commonly used as a counter shared among threads, and AtomicReference, which is used in lock-free implementations of concurrent programs.
Lock-free programming is a big step to take because of its complexity, but once you understand how an optimistic lock works, you can grasp the core idea of how lock-free programs are built. However, each locking method has its own drawbacks: coarse-grained locking is obviously painfully slow when the critical section is large or heavily contended, yet it is the easiest to implement. Optimistic locking is promising because it uses fewer locks and the average waiting time of a thread decreases; however, it can cause too many retries, and you have to rely on atomic variables (such as the AtomicInteger class) and make some fields volatile in order to implement it. Even lock-free code has its disadvantages. Which approach to choose depends on what you are trying to achieve, not on showing how complicated things can be.