OpenMP

OpenMP API ื”ื•ื ืื•ืกืฃ ื›ืœื™ื ืฉืžืืคืฉืจื™ื ืชื›ื ื•ืช ืžืงื‘ื™ืœื™ ื‘ืฉืคืช c ื• cpp ื•ืฉืคื•ืช ื ื•ืกืคื•ืช.
pthreads ื”ื•ื ืขื“ื™ื™ืŸ ื”ื›ืœื™ ื”ืขื“ื™ืฃ ืžื‘ื—ื™ื ืช ื‘ื™ืฆื•ืขื™ื ื•ื’ืžื™ืฉื•ืช. ืขื ื–ืืช, ืžื‘ื—ื™ื ืช ืจืžืช ื”ืื‘ืกื˜ืจืงืฆื™ื™ื” ื•ื”ื ื•ื—ื•ืช ืœืžืชื›ื ืช ืขื ื‘ื™ืฆื•ืขื™ื ื˜ื•ื‘ื™ื ืขื‘ื•ืจ ื”ืจื‘ื” ืชื‘ื ื™ื•ืช ืขื™ืฆื•ื‘ ืžืงื‘ื™ืœื™ื•ืช. ื‘ื›ืœ ืžืงืจื”, ื ื™ืชืŸ ืœื›ืชื•ื‘ ืงื•ื“ ืฉืžืฉืœื‘ ื‘ื™ืŸ ื”ืฉื ื™ื™ื.

OpenMP ืžืชืœื‘ืฉืช ืขืœ ื›ืžื” ืฉื›ื‘ื•ืช ื‘ software stack, ืžืจื›ื™ื‘ื™ื ื‘ืกื™ืกื™ื™ื ืฉืœ OpenMP ื”ื compiler directives, OpenMP library ื•- environment variables.

Pasted image 20240222150516.png|350

ื ืกืชื›ืœ ืขืœ ืงื•ื“ hello world ืœื“ื•ื’ืžื” ื‘ OpenMP:

#include <omp.h>
#include <stdio.h>

int main() {
	#pragma omp parallel
		{
			printf(" hello ");
			printf(" world\n ");
		}
	return 0;
}

ื”ืฉื•ืจื” pragma omp parallel ื”ื™ื ื”ื ื—ื™ื•ืช ืœืงื•ืžืคื™ื™ืœืจ.

ื”ื‘ืœื•ืง ื‘ืคื ื™ื ื—ื™ื™ื‘ ืœื”ื™ื•ืช ืขื ืฉื•ืจืช ื›ื ื™ืกื” ืื—ืช ื•ืฉื•ืจืช ื™ืฆื™ืื” ืื—ืช. ื”ื‘ืœื•ืง ืฉื‘ืคื ื™ื ื ืงืจื parallel region.

ื ื‘ื—ืŸ ืืช ืคืœื˜ ื”ืชื•ื›ื ื™ืช:
Pasted image 20240222151243.png|200ืจืืฉื™ืช, ื ืฉื™ื ืœื‘ ืฉืงื™ืžืคืœื ื• ืขื ื”ื“ื’ืœ fopenmp.
ื–ื” ื“ื’ืœ ืฉืื•ืžืจ ืœgcc ืœื”ืชื™ื™ื—ืก ืœdirectives ืฉื”ื’ื“ืจื ื• ื‘ืงื•ื“ (ื‘ืžืงืจื” ื”ื–ื” ื”ืฉื•ืจื” pragma omp parallel).

ื ืชื ื• ื”ื•ืจืื” ืœ- gcc ืœื”ืจื™ืฅ ืืช ื”ืงื•ื“ ืฉื‘- block ื‘ืžืงื‘ื™ืœ ืขื ืžืกืคืจ default ืฉืœ threads, ืฉืฉื•ื•ื” (ื‘ื“ืจืš ื›ืœืœ) ืœืžืกืคืจ cores ืฉื™ืฉ ื‘ืžื—ืฉื‘ ื›ืคื•ืœ ืžืกืคืจ ื” Hardware threads ืฉื™ืฉ ื‘ื›ืœ core. ืœื›ืŸ ื”ืงื•ื“ ื‘ืžืงืจื” ื”ื–ื”, ืžืชื‘ืฆืข ื‘ืžืงื‘ื™ืœ ืข"ื™ 4 threads ื•ืื ื—ื ื• ืจื•ืื™ื 4 ื”ื“ืคืกื•ืช ืฉืœ hello world.

CPU Info

ื›ื“ื™ ืœืงื‘ืœ ืžื™ื“ืข ืžืคื•ืจื˜ ืขืœ ืืจื›ื™ื˜ืงื˜ื•ืจืช ื”ืžื—ืฉื‘ ืฉื‘ืจืฉื•ืชื ื• ื ื•ื›ืœ ืœื”ืฉืชืžืฉ ื‘ืคืงื•ื“ืช lscpu. ืื ื ืจืฆื” ืจืง ืืช ืžืกืคืจ ื”ืœื™ื‘ื•ืช ื ืฉืชืžืฉ ื‘ืคืงื•ื“ืช nproc
Pasted image 20240222172229.png

num of threads:
ื›ื“ื™ ืœื”ื’ื“ื™ืจ ื›ืžื” threads ื ืจืฆื” ื›ื“ื™ ืœื”ืจื™ืฅ ืงื˜ืข ืงื•ื“ ืžืงื‘ื™ืœื™ ื ืฉืชืžืฉ ื‘ืคืงื•ื“ื” omp_set_num_threads(n) ืœืคื ื™ ืคืชื™ื—ืช ื”ื‘ืœื•ืง.

ื›ื“ื™ ืœืงื‘ืœ ืืช ื”ืžื–ื”ื” ืฉืœ ื›ืœ thread ื ืจื™ืฅ ืืช ื”ืคืงื•ื“ื” omp_get_thread_num .

ื‘ืื•ืคืŸ ืกื›ืžืชื™ ืืคืฉืจ ืœืชืืจ ืชื•ื›ื ื™ืช ื”ืžืฉืชืžืฉืช ื‘ pragma ื‘ืื•ืคืŸ ื”ื‘ื-
Pasted image 20240308141508.png

Fork-Join parallelism:
ืงื˜ืขื™ื ืžืกื•ื™ืžื™ื ืžื”ืงื•ื“ ืžืžื•ืงื‘ืœื™ื, ืงื˜ืข ืงื•ื“ ืžืงื‘ื™ืœื™ ื”ื‘ื ืžืชื—ื™ืœ ืจืง ื›ืืฉืจ ืงื˜ืข ืงื•ื“ ืžืงื‘ื™ืœื™ ื”ืงื•ื“ื ืœื• ืžืกืชื™ื™ื. ื”ื”ื ื—ื” ืฉืœ ืฉื™ื˜ื” ื–ื• ื”ื™ื ืฉื”ืชื•ื›ื ื™ืช ืžืชื—ื™ืœื” ืžthread ื™ื—ื™ื“ ืฉื ืงืจื master thread ืื• main ื•ืžืžื ื• ื™ื•ืฆืื™ื ื›ืœ ื”ืงื˜ืขื™ื ื”ืžืงื‘ื™ืœื™ื™ื.

Pasted image 20240308220651.png

ื—ื™ืฉื•ื‘ Euler Mascheroni constant

ื ืจืื” ืื™ืš ืžื—ืฉื‘ื™ื ื‘ืื•ืคืŸ ืกื“ืจืชื™ ื•ืขื OMP ืืช ื”ื ื•ืกื—ื” ืฉืœ ืงื‘ื•ืข ืื•ื™ืœืจ

limnโ†’โˆžlogโก(n)โˆ’โˆ‘i=1n1iโ‰ˆโˆ’0.577216

ื”ื—ื™ืฉื•ื‘ ื‘ืฆื•ืจื” ืกื“ืจืชื™ืช ื”ื•ื ืคืฉื•ื˜

double seq_euler_mascheroni() {
	double sum = 0;

	for( int i = 1; i < N; i ++){
		sum+= 1.0/i;
	}

	return log(N) - sum;
}

ื›ืขืช ื ืฉื•ื•ื” ืœื’ืจืกื” ื”ืžืงื‘ื™ืœื™ืช

double sum = 0;

#pragma omp_parallel 
{
	const int num_threads = omp_get_num_threads();

	double thread_sum = 0;
	int tid = omp_get_thread_num + 1;

	for(int i= tid; i < N; i+= num_threads) {
		threads_sum += 1.0/i;
	}

	#pragma omp critical
		sum += thread_sum
}

ืื ื ื‘ืฆืข benchmark ื‘ืืžืฆืขื•ืช ื”ืคื•ื ืงืฆื™ื” omp_get_wtime ื ืงื‘ืœ ื–ืžืŸ ืจื™ืฆื” ืฉืžืฉืžืขื•ืชื™ืช ื™ื•ืชืจ ืžื”ื™ืจ ื‘ืืœื’ื•ืจื™ืชื ื”ืžืงื‘ื™ืœื™

Pasted image 20240308150356.png

False sharing

ื ืกืชื›ืœ ืขืœ ืงื•ื“ ืกื“ืจืชื™ ื•ืžืงื‘ื™ืœื™ ืœืกื›ื™ืžืช ืžืขืจืš

Pasted image 20240308151108.png|350
Pasted image 20240308151057.png|350

ื”ืืœื’ื•ืจื™ืชื ื”ืžืงื‘ื™ืœื™ ืžื—ืฉื‘ ืืช ื”ืกื›ื•ืžื™ื ื”ื—ืœืงื™ื™ื ื•ืœื‘ืกื•ืฃ ืžื—ื‘ืจ ืืช ื”ื›ืœ ื‘ื™ื—ื“. ื ืฉื™ื ืœื‘ ืฉื”ื•ื ืžืฉืชืžืฉ ืคื” ื‘ืคืงื•ื“ืช omp single ืฉื–ืืช ืคืงื•ื“ื” ืฉืžืืคืฉืจืช ืจืง ืœthread ืื—ื“ ืœื”ื›ื ืก ืืœื™ื• ื•ืœืื—ืจ ืžื›ืŸ ืœื ืžื›ื ื™ืกื” ื™ื•ืชืจ threads ืœืงื˜ืข ื”ืงื•ื“ ื”ื–ื” (ืื ืœื ื”ื™ื™ื ื• ืžืฉืชืžืฉื™ื ื‘ OpenMP ืœื›ืชื•ื‘ ื“ื‘ืจ ื›ื–ื” ื”ื™ื” ื“ื•ืจืฉ ืžื ื’ื ื•ืŸ ืžืฉืœื ื•). ื”ืฉื™ืžื•ืฉ ื‘ืคืงื•ื“ื” ื–ืืช ื”ื™ื ื˜ื•ื‘ื” ื›ืืฉืจ ื ืจืฆื” ืœื”ื–ื™ืŸ ืคืจืžื˜ืจ ืจืง ืคืขื ืื—ืช ื‘ืžืงืจื” ื”ื–ื” thread_num ื”ื•ื ื”ืคืจืžื˜ืจ ื”ื–ื”.

ืื ื ื‘ื“ื•ืง ืืช ืชื•ืฆืื•ืช ื”ื”ืจืฆืื” ืขื‘ื•ืจ ืžืขืจื›ื™ื ื’ื“ื•ืœื™ื ื‘ืžื™ื•ื—ื“ ื ืงื‘ืœ
Pasted image 20240308151300.png
ื›ืœื•ืžืจ ืื ื—ื ื• ืจื•ืื™ื ืฉื–ืžื ื™ ื”ื”ืจืฆื” ื”ืžืงื‘ื™ืœื™ื™ื ื’ืจื•ืขื™ื ื™ื•ืชืจ ืžื”ืกื“ืจืชื™ื™ื.

ื–ื” ื™ื›ื•ืœ ืœื ื‘ื•ืข ืžื›ืžื” ืกื™ื‘ื•ืช:
1) overhead
2) cache- ืœื›ืœ thread ืฉืจืฅ ืขืœ ืžืขื‘ื“ ืื—ืจ ื™ืฉ L1 cache ืžืฉืœื•. ื ื–ื›ื™ืจ ืฉื›ืืฉืจ ืงื•ืจืื™ื ืžื™ื“ืข ืžื”ื–ื›ืจื•ืŸ ื•ืฉืžื™ื ื‘ cache ืฉืžื™ื ื’ื ืืช ื”ืžื™ื“ืข ืžืกื‘ื™ื‘ ื‘ืื•ืชื• cache line ื›ื“ื™ ืœืฉืžื•ืจ ืขืœ ืœื•ืงืœื™ื•ืช ื’ื‘ื•ื”ื”. ื‘ืชื›ื ื•ืช ืžืงื‘ื™ืœื™ multicore ื›ืืฉืจ thread ืžืฉื ื” ืืช ื”cache ื‘ืžืขื‘ื“ ืฉืœื•, ื”ื—ื•ืžืจื” ืžื•ื“ื™ืขื” ืœืฉืืจ ื”ืžืขื‘ื“ื™ื ืฉื”cache ื”ืฉืชื ื” ื•ืžืชืจื—ืฉ cache invalidation ืžื” ืฉื’ื•ืจื ืœื›ืœ L1 cache ืฉืœ ื”threads ืœื’ืฉืช ืฉื•ื‘ ืœRAM ืขืœ ื›ืœ ืคืขื•ืœืช ื—ื™ื‘ื•ืจ ืžื” ืฉื™ื•ืฆืจ ื—ื•ืกืจ ื ื™ืฆื•ืœ ืฉืœ ื”cache .

ืžื” ื”ืคืชืจื•ืŸ?
ื ื•ื›ืœ ืœืฉื ื•ืช ืืช Partial sum ืœืขื‘ื•ืจ ืขื ืžืขืจืš ื“ื• ืžื™ืžื“ื™ ื•ื›ืœ threads ื™ืงื‘ืœ ืฉื•ืจื” ืžืฉืœื• ื•ื›ื›ื” ืœื ื™ื”ื™ื” ืžืฆื‘ ืฉthreads ื“ื•ืจืกื™ื ืืช ื”lines ืื—ื“ ืฉืœ ื”ืฉื ื™.

Pasted image 20240308152659.png

ื ื™ืชืŸ ืœืจืื•ืช ืฉื–ื” ืื›ืŸ ืžืฉืคืจ ืžืฉืžืขื•ืชื™ืช ืืช ื”ื‘ื™ืฆื•ืขื™ื ืฉืœ ืชื•ื›ื ื™ืช ืžืงื‘ื™ืœื™ืช
Pasted image 20240308152712.png

PI Approximation

ืื ื—ื ื• ื™ื•ื“ืขื™ื ืฉืžืชืงื™ื™ื

โˆซ014.01+x2dx=ฯ€

ื›ืœื•ืžืจ ื ื•ื›ืœ ืœืงื‘ืœ ืงื™ืจื•ื‘ ืฉืœ ฯ€ ื‘ืืžืฆืขื•ืช ื—ื™ืฉื•ื‘ ืฉืœ ื”ืื™ื ื˜ื’ืจืœ ื›ืกื›ื•ื ืฉืœ N ืฉื˜ื—ื™ ืžืœื‘ื ื™ื

โˆ‘i=0nF(xi)โ‹…ฮ”xโ‰ˆฯ€

Screenshot 2024-03-08 at 17.55.15.png|300

ื ื›ืชื•ื‘ ืงื•ื“ ืžืงื‘ื™ืœื™ ืฉืžืืคืฉืจ ืœื ื• ืœื—ืฉื‘ ืืช ื–ื” ื•ื ื‘ื—ืŸ ืืช ื”ื‘ื™ืฆื•ืขื™ื ืฉืœื• ืœืขื•ืžืช ืงื•ื“ ืกื“ืจืชื™

ื”ื—ื™ืฉื•ื‘ ื”ืžืงื‘ื™ืœื™ ื™ื”ื™ื” ื“ื•ืžื” ืœืžื” ืฉืขืฉื™ื ื• ื‘ื“ื•ื’ืžืื•ืช ื”ืงื•ื“ืžื•ืช, ื›ืœ thread ื™ื‘ืฆืข ืกื›ื•ื ื—ืœืงื™ ืฉืœ ืฉื˜ื—ื™ ื”ืžืœื‘ื ื™ื ื•ืœื‘ืกื•ืฃ ื‘ืงื˜ืข ืงื•ื“ ืกื“ืจืชื™ ื™ืกื›ื•ื ืืช ื›ืœ ื”ืกื›ื•ืžื™ื ื”ื—ืœืงื™ื™ื. ืืช ื”ืกื›ื•ืžื™ื ื ื™ืชืŸ ืœืฉืžื•ืจ ื‘ืืžืฆืขื•ืช ืžืขืจืš ืฉื’ื•ื“ืœื• ื›ืžืกืคืจ ื”threads ื•ื›ื›ื” ื›ืœ thread ืžืงื‘ืœ ืžืฉื‘ืฆืช ืื—ืช ื‘ืžืขืจืš ืžื‘ืœื™ ืœื“ืื•ื’ ืœrace condition (ื›ืคื™ ืฉื”ืจืื ื•, ื–ื” ื™ื›ื•ืœ ืœืคื’ื•ืข ื‘ื‘ื™ืฆื•ืขื™ื ืื ืœื ืœื•ืงื—ื™ื ื‘ื—ืฉื‘ื•ืŸ ืืช ื”ืžื˜ืžื•ืŸ).

Pasted image 20240308180620.png|300

ื‘ืชื›ื ื•ืช ืžืงื‘ื™ืœื™ ื–ื” ื™ื™ืจืื” ื›ื›ื”

#include <omp.h>
#include<stdio.h>

	#define N 2 // required number of threadsint main ()
	
    int N_actual; // actual number of threads
ย  ย  long num_steps = 100000;
ย  ย  double sum[N] = {0};
ย  ย  double step = 1.0/(double) num_steps;

    omp_set_num_threads(N);
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        double x;
        #pragma omp single
           N_actual = omp_get_num_threads();

ย  ย      for (int i=id; i< num_steps; i+= N_actual) {
ย  ย  ย  ย      x = (i+0.5)*step;
ย  ย  ย  ย      sum[id] += 4.0/(1.0+x*x);
ย  ย      }
    }
ย  ย  double pi = 0;
    for (int i=0; i < N_actual; i++)
        pi += step * sum[i];
ย  ย  printf("%f\n", pi);
ย  ย  return 0;
}

ืื ื ืžื“ื•ื“ ื‘ื™ืฆื•ืขื™ื ื ืงื‘ืœ
Pasted image 20240308181117.png|300

ื ื™ืชืŸ ืœืจืื•ืช ืฉื”ื—ืœ ืž N=3 ื™ืฉ ื™ืจื™ื“ืช ื‘ื™ืฆื•ืขื™ื. ื”ืกื™ื‘ื” ืœื›ืš ื”ื™ื ืžื›ื™ื•ื•ืŸ ืฉื‘ืžืขื‘ื“ ืื—ื“ ื™ืฉ ื›ืžื” ืœื™ื‘ื•ืช ื•ื‘ื›ืœ ืœื™ื‘ื” ื ื™ืชืŸ ืœื”ืจื™ืฅ ืžืกืคืจ HW threads (ื‘ื“ืดื› 2) ื•ืื—ื“ ื”ืคืชืจื•ื ื•ืช ื”ื—ื•ืžืจืชื™ื™ื ืฉื™ืฉ ืœืžืขื‘ื“ื™ื ืžืจื•ื‘ื™ ืœื™ื‘ื•ืช ื”ื•ื ืฉื‘ืจื’ืข ืฉื™ืฉ ืฉื ื™ cores ืขืœ ืื•ืชื” ืชื•ื›ื ื™ืช ื•ื” L1 cache ืฉืœ ืื—ื“ ืžื”ื ืžืฉืชื ื” , ื”ื•ื ืžื•ื“ื™ืข ืœL1 cache ืฉืœ ื”ืื—ืจื™ื ืฉื”ื invalid ื•ืขืœื™ื”ื ืœืขื“ื›ืŸ ืืช ื”ืžื˜ืžื•ืŸ ืฉืœื”ื.

ื ื™ืงื— ื‘ื™ืฆื•ืข ืฉืœ ืืจื‘ืขื” threads . ืžื›ื™ื•ื•ืŸ ืฉื›ืœ core ืžื›ื™ืœ ืฉื ื™ HR (Hardware threads), ื”ื•ื ื™ื›ื•ืœ ืœื”ืจื™ืฅ ืฉื ื™ threads. ืืจื‘ืขื” threads ื™ืจื•ืฆื• ืขืœ ืฉื ื™ cores ืฉื•ื ื™ื. ืœื›ืœ core ื™ืฉ L1 cache ืคืจื˜ื™ ืžืฉืœื•. ื›ืืฉืจ thread0 ืจื•ืฆื” ืœืขื“ื›ืŸ sum[0], ื—ืœืง ืฉืœ RAM ื‘ื’ื•ื“ืœ cache line (ื‘ื’ื•ื“ืœ 64 bites ืขื‘ื•ืจ 64-bit architecture) ืžื•ื‘ื ืœ- L1 cache ืฉืœ core ืืฉืจ ืžืจื™ืฅ ืืช thread1.

Pasted image 20240308182134.png|300

ื›ืืฉืจ thread2 ืจื•ืฆื” ืœืขื“ื›ืŸ sum[2], ื—ืœืง ืฉืœ RAM ื‘ื’ื•ื“ืœ cache line ืžื•ื‘ื ืœ- L1 cache ืฉืœ core ืืฉืจ ืžืจื™ืฅ ืืช thread2. ื ืฉื™ื ืœื‘ ืฉืฉื ื™ cache lines ืžื›ื™ืœื•ืช ืืช ื”ืžืขืจืš ืฉืœื•. ื‘ืจื’ืข ืฉ- thread2 ืžื‘ืฆืข ืขื“ื›ื•ืŸ ืฉืœ sum[2], ื”- cache line ื”ื–ื” ื‘- L1 cache of Core1 ืžืกื•ืžืŸ ื›- invalid, ื›ืœื•ืžืจ, ืœื ืœืฉื™ืžื•ืฉ. ื–ื” ืžื›ื™ื•ื•ืŸ ืฉื”ื•ื ืžื›ื™ืœ data ืœื ืขื“ื›ื ื™. ื›ืืฉืจ thread0 ื™ืจืฆื” ืœืขื“ื›ืŸ ืฉื•ื‘ ืคืขื ืืช sum[0], ื”ื•ื ื™ืฆื˜ืจืš ืœื‘ืฆืข ื’ื™ืฉื” ืœ- RAM ื›ื“ื™ ืœื”ื‘ื™ื ืืช ื”- cache line ืžื—ื“ืฉ.

Pasted image 20240308182644.png|300

ื”ืคืชืจื•ืŸ ื”ื•ื ื”ืงืฆืื” ืฉืœ ืžืขืจืš ื“ื• ืžื™ืžื“ื™ ืฉืชืžื ืข ื—ืคื™ืคื” ื‘ื™ืŸ cache lines ืฉืœ threads ืฉื•ื ื™ื. ืื•ืžื ื, ื–ื” ื‘ื–ื‘ื•ื– ื–ื›ืจื•ืŸ RAM, ืื‘ืœ ื–ื›ืจื•ืŸ ื”ื•ื ืžื“ื“ ืžืฉื ื™ ืœืขื•ืžืช ื–ืžืŸ ืจื™ืฆื”.

#include <omp.h>
#include<stdio.h>

#define N 2 // required number of threads
#define PAD 8 // assume 64-byte L1 cache line size
int main () { 
    int N_actual; // actual number of threads
ย  ย  long num_steps = 100000000;
ย  ย  double sum[N][PAD] = 0;
ย  ย  double step = 1.0/(double) num_steps;

    double tdata = omp_get_wtime();
    omp_set_num_threads(N);
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        double x;
        #pragma omp single
           N_actual = omp_get_num_threads();

ย  ย      for (int i=id; i< num_steps; i+= N_actual) {
ย  ย  ย  ย      x = (i+0.5)*step;
ย  ย  ย  ย      sum[id][0] += 4.0/(1.0+x*x);
ย  ย      }
    }
ย  ย  double pi = 0;
    for (int i=0; i < N_actual; i++)
        pi += step * sum[i][PAD];
    tdata = omp_get_wtime() - tdata;
    printf("pi = %f in %f secs\n", pi, tdata);
ย  ย  return 0;
}

ื›ื™ื•ื•ืŸ ืฉื›ืœ thread ืขื•ื‘ื“ ืขืœ ืฉื•ืจื” ืฉืœื• ื‘ืžืขืจืš ื‘ื”ืชืื ืœ id, ืœื›ืŸ ืื™ืŸ ืฉื•ืจื•ืช cache ืžืฉื•ืชืคื•ืช.

Pasted image 20240308183013.png|300

ื”ื–ืžืŸ ืจื™ืฆื” ืื›ืŸ ื”ืฉืชืคืจ ื’ื ืขื‘ื•ืจ N=3,4. ื”ืžื—ื™ืจ ืฉืงื™ื‘ืœื ื• ื”ื•ื ืงื•ื“ ืžืื•ื“ ืžืกื•ืจื‘ืœ. ื ื ืกื” ืœื‘ื—ื•ืŸ ื“ืจื›ื™ื ื ื•ืกืคื•ืช.

ื›ืœื™ ื ื•ืกืฃ ื”ื•ื ื”ื’ื“ืจืช critical section ื‘ืชื•ืš ื”ืงื•ื“ ื”ืžืงื‘ื™ืœื™ , ื‘ืžืงื•ื ืœื”ื—ื–ื™ืง ืžืขืจืš ืฉืœ ืกื›ื•ืžื™ื ื—ืœืงื™ื™ื ื ื•ื›ืœ ืœื“ืื•ื’ ืฉื›ืœ thread ื™ื—ืฉื‘ ืืช ื”ืกื›ื•ื ืฉืœื• ื‘ืฆื•ืจื” ืžืงื•ืžื™ืช ื•ื‘ืงื˜ืข ื”ืงื•ื“ ื”ืงืจื™ื˜ื™ ื ืกื›ื•ื ืืช ื”ืชื•ืฆืื” ื”ืกื•ืคื™ืช ืœืขืจืš ืฉืœ ฯ€ .

#include <omp.h>
#include<stdio.h>

#define N 2 // required number of threadsint main () { 
ย  ย  long num_steps = 100000000;
ย  ย  double pi = 0;
ย  ย  double step = 1.0/(double) num_steps;

    double tdata = omp_get_wtime();
    omp_set_num_threads(N);
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        double x, sum = 0;
        #pragma omp single
           N_actual = omp_get_num_threads();

ย  ย      for (int i=id; i< num_steps; i+= N_actual) {
ย  ย  ย  ย      x = (i+0.5)*step;
            sum += 4.0/(1.0+x*x);
        }
ย  ย  ย  ย  #pragma omp critical
            pi += sum * step;
    }
    tdata = omp_get_wtime() - tdata;
    printf("pi = %f in %f secs\n", pi, tdata);
ย  ย  return 0;
}

ื‘ืฉื™ื˜ื” ื–ื• ื ืงื‘ืœ ื‘ื™ืฆื•ืขื™ื ื“ื•ืžื™ื ืœืฉื™ื˜ื” ื”ืงื•ื“ืžืช ื‘ืงื•ื“ ื™ื•ืชืจ ืžืกื•ื“ืจ.

Loops

ื›ืžื” ื›ืœืœื™ ืœื•ืœืื•ืช ื—ืฉื•ื‘ื™ื ื‘ OpenMP:

  1. ืื for loop ืžื•ืคื™ืข ื‘ืงื˜ืข ืงื•ื“ ืžืงื‘ื™ืœื™, ืื– ื›ืœ ืื—ื“ ืžื” threads ื™ื‘ืฆืข ืืช ื”ืœื•ืœืื” ื‘ืžืœื•ืื”.
#include <omp.h>
#include<stdio.h>

int main () { 
    #pragma omp parallel
    {
        for(int i=0; i<2; i++)
            printf("i=%d, ID=%d\n", i, omp_get_thread_num());
    }
    return 0;
}

  1. ืื ืจื•ืฆื™ื ืœื‘ืฆืข ืœื•ืœืื” ืคืขื ืื—ืช ื‘ืœื‘ื“, ืื‘ืœ ืœื—ืœืง ืื•ืชื” ืœื—ืœืงื™ื ื›ืš ืฉื›ืœ ื—ืœืง ื™ืชื‘ืฆืข ืขืœ ื™ื“ื™ thread ืื—ืจ ืื– ื ื™ืชืŸ ืœื”ืฉืชืžืฉ ื‘pragma omp for ื‘ืชื•ืš parallel block ืื•pragma omp parallel for.
#include <omp.h>
#include<stdio.h>

int main () { 
    #pragma omp parallel
    {
          #pragma omp for
            for(int i=0; i<2; i++)
                printf("i=%d, ID=%d\n", i, omp_get_thread_num());
    }
    return 0;
}

//or use:
int main () { 
    #pragma omp parallel for
     for(int i=0; i<2; i++)
        printf("i=%d, ID=%d\n", i, omp_get_thread_num());
  
    return 0;
}

ื ืฉื™ื ืœื‘ ืฉื‘ื“ื•ื’ืžืช ื”ืงื•ื“ ื”ืื—ืจื•ืŸ ื ืงื‘ืœ 2 threads ื•ืœื 4 ื›ืžื• ื‘ืจื•ื‘ ื”ืžืงืจื™ื ื”ื“ื™ืคื•ืœื˜ื™ื ื‘ื’ืœืœ ืฉื”ืงื•ืžืคื™ื™ืœืจ ื‘ืฉื™ืชื•ืฃ ืขื ืžืขืจื›ืช ื”ื”ืคืขืœื” ื”ื—ืœื™ื˜ื• ืœื”ืงืฆื•ืช ืคื—ื•ืช ืžืžื” ืฉื”ื•ื ื™ื›ื•ืœ ื›ื™ ืžืกืคืจ ื”ืื™ื˜ืจืฆื™ื•ืช ื”ื•ื 2.

Nested loops

ื›ืืฉืจ ื™ืฉ nested loop ืฉื›ืชื•ื‘ ื‘ืชื•ืš ื‘ืœื•ืง ืžืงื‘ื™ืœื™, ืจืง ื”ืœื•ืœืื” ื”ื—ื™ืฆื•ื ื™ืช ืชืชืžืงื‘ืœ.

Pasted image 20240308213755.png|350

ื–ืืช ื”ื”ืชื ื”ื’ื•ืช ื”ื“ื™ืคื•ืœื˜ื™ื‘ื™ืช ืื ื ืจืฆื” ืœืฉื ื•ืช ืื•ืชื” ื ื•ื›ืœ ืœื”ืฉืชืžืฉ ื‘ืžื™ืœื” ื”ืฉืžื•ืจื” collaps(#loops). ื–ื” ืื•ืžืจ ืœืงื•ืžืคื™ื™ืœืจ ืฉืื ื—ื ื• ืจื•ืฆื™ื ืœืžืงื‘ืœ ืืช ื›ืœ ื”ืœื•ืœืื•ืช ื”ืคื ื™ืžื™ื•ืช ืขื“ ื”ืจืžื” ืฉื”ื™ื ื”ืžืกืคืจ ืฉื”ืขื‘ืจื ื•

Pasted image 20240308214014.png|350

ื‘ืžืงืจื” ื–ื”, ื›ืœ thread ื‘ื™ืฆืข ืื™ื˜ืจืฆื™ื” ืื—ืช. ื ืฉื™ื ืœื‘ ืฉื‘ืงื•ื“ ื–ื” ื‘ื ื™ื’ื•ื“ ืœืงื•ื“ื ืœื ืคื’ืขื ื• ื‘ืจืžืช ื”ืžืงื‘ื™ืœื™ื•ืช ื•ื’ื ืœื ื‘ื ื›ื•ื ื•ืช ืฉืœ ื”ืงื•ื“.

End of loop barrier:
ื ืฉื™ื ืœื‘ ืฉื™ืฉ barrier ืœื ืžืคื•ืจืฉ ื‘ืกื•ืฃ ืœื•ืœืืช for ืžืงื‘ื™ืœื™ืช. ื›ืœื•ืžืจ threads ื™ื—ื›ื• ื‘ืกื•ืฃ ื”ืœื•ืœืื” ืœืฉืืจ ื”threads ืฉืจืฆื™ื ืฉื™ืกื™ื™ืžื• ืืช ื”ืœื•ืœืื” ื’ื ื›ืŸ.

Pasted image 20240308214542.png|350

ื ื™ืชืŸ ืœืจืื•ืช ืœืคื™ ื”ื”ื“ืคืกื” ืฉื›ืœ ื”threads ื—ื™ื›ื• ืขื“ ืฉื›ื•ืœื ื™ืกื™ื™ืžื• ืืช ื”ืœื•ืœืื” ื•ืจืง ืื– ื”ืชื—ื™ืœื• ื‘ื”ื“ืคืกื”.

ืื ื ืจืฆื” ืœื”ื•ืจื™ื“ ืืช ื”ืœื•ื’ื™ืงื” ื”ื–ืืช, ื ื•ื›ืœ ืœื”ื•ืกื™ืฃ ืืช ืžื™ืœืช ื”ืžืคืชื— ืœื™ื“ ื”ืฆื”ืจืช ื”parallel for ืฉื ืงืจื nowait. ื‘ืจื’ืข ืฉืžื•ืกื™ืคื™ื ืืช ื”ืžื™ืœื” ื”ื–ื• ืื– ื”ืงื•ืžืคื™ืœืจ ืœื ื™ืฆื™ื‘ ืžื—ืกื•ื ืกืžื•ื™ ื”ื™ื›ืŸ ืฉื”ืฆื”ืจื ื• (ืื ื™ืฉ ืขื•ื“ ื”ืฆื”ืจื•ืช ื‘ื”ืžืฉืš ืื– ืฉื ืœื ื™ื‘ื•ื˜ืœ ื”ืžื—ืกื•ื).

Pasted image 20240308214759.png|350

Reduction

ื ืกืชื›ืœ ืขืœ ื”ืงื•ื“ ื”ื‘ื

#include <omp.h>
#include<stdio.h>

int main () { 
    int sum = 0;
    #pragma omp parallel 
    {
          #pragma omp for reduction (+:sum)
               for(int i=0; i<10; i++) {
                    sum += i;
                    printf("thread%d sum = %d\n", 
                              omp_get_thread_num(), sum);
               }
    }
    printf("sum = %d\n", sum);
    return 0;
}

ื ื ืกื” ืœื”ื‘ื™ืŸ ืžื” ืงื•ืจื” ื›ืืŸ ืขืœ ื™ื“ื™ ื”ืชื‘ื•ื ื ื•ืช ื‘ื”ื“ืคืกื•ืช

Pasted image 20240308215107.png|250

ืื ื›ืŸ, ืžื”ื• ื”ื›ืœื™ reduction?
ื”ื›ืœื™ ื”ื–ื” ืžืงื‘ืœ ื–ื•ื’ ืกื“ื•ืจ ืžื”ืฆื•ืจื” (op:var), ื›ืœ ืื—ื“ ืžื”threads ืžืงื‘ืœ ืขื•ืชืง ืคืจื˜ื™ ืฉืœ ืžืฉืชื ื” var ืฉืžืื•ืชื—ืœื™ื ืœืคื™ ืกื•ื’ ืคืขื•ืœื” op. ืœืื—ืจ ืžื›ืŸ ื”ืคืขื•ืœื” ืžืชื‘ืฆืขืช ื‘ื›ืœ ืื™ื˜ืจืฆื™ื” ืžืงื•ืžื™ืช ืขื‘ื•ืจ ื”thread. ื‘ืกื•ืฃ ื›ืœ ื”ืžืฉืชื ื™ื ื”ืคืจื˜ื™ื™ื var ืžืกืชืžื›ื™ื ืขืœ ื™ื“ื™ ื”ืคืขื•ืœื” op ืœืชื•ืš ื”ืžืฉืชื ื” ื”ืžืฉื•ืชืฃ.

ืืœื” ื”ืŸ ื”ืคืขื•ืœื•ืช ื”ืืคืฉืจื™ื•ืช ื•ืขืจืš ื”ืืชื—ื•ืœ ืขื‘ื•ืจืŸ-

Pasted image 20240308220232.png|200

ื ืจืื” ื›ืขืช ื›ื™ืฆื“ ืœืžืงื‘ืœ ืืช ืงื•ื“ ื” Pi Approximation ืžืžืงื•ื“ื ื‘ืืžืฆืขื•ืช reduction

Pasted image 20240309003624.png
ื›ืขืช ื ื™ืชืŸ ืœืจืื•ืช ืื™ืš ื‘ืžื™ืžื•ืฉ ืžืื•ื“ ื ืงื™ ื—ื™ืฉื‘ื ื• ืืช ื”ืงื™ืจื•ื‘ ืฉืœ ฯ€ ื‘ื ื™ื’ื•ื“ ืœืงื•ื“ ื”ืงื•ื“ื ืฉื”ื™ื” ืžืื•ื“ ืžืกื•ืจื‘ืœ ื•ืœื ื ืงื™.

Reduction with Arrays:
Pasted image 20240309003806.png
ื ื™ืชืŸ ืœื‘ืฆืข reductio ื’ื ืขืœ ืžืขืจื›ื™ื ื•ืœื™ื™ืฆืจ ืฉื›ืคื•ืœ ืฉืœื”ื ืœื›ืœ thread.

Schedule

ื ืกืชื›ืœ ืขืœ ืงื˜ืข ื”ืงื•ื“ ื”ื‘ื
Pasted image 20240308222126.png
ื”ืœื•ืœืื” ื”ื—ื™ืฆื•ื ื™ืช ืžืชื—ืœืงืช ืฉื•ื•ื” (ื‘ื™ื—ืก ืœืžืกืคืจ ื”ืื™ื˜ืจืฆื™ื•ืช) ื‘ื™ืŸ ืืจื‘ืขืช ื”threads ืฉื ื•ืฆืจื•.

ื”ืœื•ืœืื” ื”ืคื ื™ืžื™ืช ื”ื™ื ืžืงื•ืžื™ืช ืœื›ืœ threads ื›ืคื™ ืฉืืžืจื ื•.
ื ืฉื™ื ืœื‘ ืฉืื•ืžื ื ื”ื—ืœื•ืงื” ืฉืœ ืžืกืคืจ ื”ืื™ื˜ืจืฆื™ื•ืช ืฉื•ื•ื”, ืื‘ืœ ื›ืœ ืื—ื“ ืžื”threads ืžืจื™ืฅ ืžืจื™ืฅ ืขื ืขืจื›ื™ i ืฉื•ื ื™ื. ืœื›ืŸ ืชื”ื™ื” ืคื” ื‘ืขื™ื” ืฉthreads ืฉืžืงื‘ืœื™ื ืืช 3 ื”ืื™ื˜ืจืฆื™ื•ืช ื”ื’ื‘ื•ื”ื•ืช ื™ื•ืชืจ ื™ื‘ืฆืขื• ื™ื•ืชืจ ืคืขื•ืœื•ืช ื—ื™ืฉื•ื‘ื™ื•ืช ืžืื•ืคื™ ื”ื’ื“ืจืช ื”ืœื•ืœืื” ื”ืคื ื™ืžื™ืช.

Pasted image 20240308222745.png
ื”ื”ื™ื’ื™ื•ืŸ ืฉืœ ื”ืœื•ืœืื” ื”ื–ืืช ืคื’ืข ื‘ load balance ืฉื›ืŸ ื”ืขื‘ื•ื“ื” ืœื ืžืชื—ืœืงืช ืฉื•ื•ื” ื‘ืฉื•ื•ื”.

Static Scheduling

ื›ื“ื™ ืœื˜ืคืœ ื‘ื–ื” ื ืฉืชืžืฉ ื‘key work ืฉื ืงืจื schedule. ื”ืžื™ืœื” ื”ืฉืžื•ืจื” ื”ื–ื• ืžืงื‘ืœ ื›ืคืจืžื˜ืจ ืืช ืื•ืคื™ ื”ื”ืงืฆืื” (static ืื• dynamic ื•ื™ืฉ ื’ื ื ื•ืกืคื™ื) ื•ืคืจืžื˜ืจ ื ื•ืกืฃ ืฉื”ื•ื ืžืกืคืจ ืฉืงื•ื‘ืข ืดื›ืžื” ืื™ื˜ืจืฆื™ื•ืช ื ืจืฆื” ืœื”ืงืฆื•ืช ืœื›ืœ thread ืขื‘ื•ืจ ืจื™ืฆื” ืื—ืช ืœืคื ื™ ืฉื”ื•ื ืดืžื—ืœื™ืฃืด ืœthread ื”ื‘ื.
Pasted image 20240308223157.png
ื›ืืฉืจ ื ืฉื™ื 1 ื‘ืงื•ื“ ื”ื ืดืœ ื ืงื‘ืœ ื”ืชื ื”ื’ื•ืช ืฉืžื–ื›ื™ืจื” Round Robin, ื›ืœ thread ื™ืจื•ืฅ ืื™ื˜ืจืฆื™ื” ืื—ืช ื•ืื– ื”thread ื”ื‘ื ื™ืจื™ืฅ ืืช ื”ืื™ื˜ืจืฆื™ื” ื”ื‘ืื” ื‘ืชื•ืจ ื•ื›ืŸ ื”ืœืื”.

Pasted image 20240308223446.png

ืื ื”ื™ื™ื ื• ืžื’ื“ื™ืœื™ื ืืช ื”ืคืจืžื˜ืจ ืž 1 ืœ 2 ื”ื™ื™ื ื• ืžืงื‘ืœื™ื load balance ืคื—ื•ืช ื˜ื•ื‘:

Pasted image 20240308224048.png

ืื™ืŸ ื‘ืืžืช ืชืฉื•ื‘ื” ื ื›ื•ื ื” ืœืงื‘ื™ืขืช ื’ื•ื“ืœ chunk ืžืชืื™ื ืืš ื—ืฉื•ื‘ ืœืฆื™ื™ืŸ ืฉืฉื™ืžื•ืฉ ื‘ static scheduling ื”ื•ื ืžืื•ื“ ื™ืขื™ืœ ื›ื™ ืื™ืŸ ืฆื•ืจืš ื‘ื›ืœืœ ืœืชืงืฉื•ืจืช ื‘ื™ืŸ ื”threads ืฉื›ืŸ ื‘ื–ืžืŸ ื”ืจื™ืฆื” ื›ืœ ืื—ื“ ื™ื•ื“ืข ื‘ื™ื“ื™ื•ืง ืžื” ืขืœื™ื• ืœื”ืจื™ืฅ ื•ื›ืžื” ืื™ื˜ืจืฆื™ื•ืช.

Caution

ื—ืœื•ืงื” ืกื˜ื˜ื™ืช ื™ื›ื•ืœื” ืœื”ื™ื•ืช ืœื ืžื•ืฆืœื—ืช ื›ืืฉืจ ืœื ื ื™ืชืŸ ืœื“ืขืช ืžืจืืฉ ื›ืžื” ื—ื™ืฉื•ื‘ื™ื ื™ืฉ ื‘ื›ืœ ืื™ื˜ืจืฆื™ื” ืœืžืฉืœ ืื ื‘ืงื•ื“ ืœืžืขืœื” ื”ื™ื™ื ื• ืžื’ื“ื™ืจื™ื ื‘ืžืงื•ื ืœืจื•ืฅ ืขื“ i ืœืจื•ืฅ ืขื“ rand ื”ื™ื™ื ื• ื™ื›ื•ืœื™ื ืœืงื‘ืœ ื—ืœื•ืงื” ืœื ืฉื™ื•ื•ื™ื•ื ื™ืช ืฉืคื•ื’ืขืช ื‘ balance.
Pasted image 20240308224623.png

Dynamic Scheduling

ืœืžืงืจื™ื ื›ืžื• ื”ื ืดืœ ื ื™ืชืŸ ืœื”ื’ื“ื™ืจ ื’ื ืฉื”ื—ืœื•ืงื” ืชื”ื™ื” dynamic.
ื‘ืžืฆื‘ ื–ื” ื›ืœ thread ืžืงื‘ืœ chunk ืฉืœ ืื™ื˜ืจืฆื™ื•ืช ืœืคื™ ื”ืคืจืžื˜ืจ ืฉื”ื’ื“ืจื ื• ื•ืžืกืคืจ ื”ืื™ื˜ืจืฆื™ื•ืช ื‘ืœื•ืœืื” ื”ื—ื™ืฆื•ื ื™ืช ื•ืžื‘ืฆืข ืื•ืชืŸ ืื‘ืœ ื”ื•ื ืžื—ืœืง ืืช ื”ืขื‘ื•ื“ื” ื‘ืฆื•ืจื” ื“ื™ื ืžื™ืช ื‘ืœื™ ื‘ื”ื›ืจื— ืกื“ืจ ืกืคืฆื™ืคื™ ืืœื ืขื ืืœื’ื•ืจื™ืชื ืฉืžื–ื›ื™ืจ Thread Pool.

int main()
{
#pragma omp parallel for schedule(dynamic, 1) 
for (int i = 0; i < 20; i++)
	{
		printf("Thread %d is running number %d\n",omp_get_thread_num(),i);
	}
	return 0;
}

ื”ืคืœื˜ ื™ื”ื™ื” :

Thread 1 is running number 2
Thread 1 is running number 7
Thread 1 is running number 9
Thread 1 is running number 10
Thread 1 is running number 11
Thread 1 is running number 13
Thread 1 is running number 14
Thread 1 is running number 15
Thread 1 is running number 17
Thread 1 is running number 19
Thread 3 is running number 0
Thread 0 is running number 4
Thread 8 is running number 12
Thread 4 is running number 3
Thread 6 is running number 6
Thread 9 is running number 16
Thread 5 is running number 1
Thread 7 is running number 8
Thread 10 is running number 18
Thread 2 is running number 5

ื—ืฉื•ื‘ ืœืฆื™ื™ืŸ ืฉื”ื”ืงืฆืื” ื”ื™ื ื—ืžื“ื ื™ืช ื‘ืฉื™ื˜ื” ืฉืœืคื™ื” ืขื•ื‘ื“ ื”ืชื–ืžื•ืŸ ื”ื“ื™ื ืžื™ ื”ื•ื First Come First Serve ื›ืœื•ืžืจ, ืื ื™ืฉื ืŸ ืื™ื˜ืจืฆื™ื•ืช ืฉืฆืจื™ืš ืœื‘ืฆืข ื•ื™ืฉ thread ืคื ื•ื™ ืฉื™ื›ื•ืœ ืœืงื—ืช ืื•ืชืŸ, ื”ื•ื ื™ื™ืงื— ืื•ืชืŸ.

ื›ืžื• ื›ืŸ, ื”ืชื–ืžื•ืŸ ื”ื“ื™ื ืžื™ ื‘ืื•ืคืŸ ื“ื™ ืื™ื ื˜ื•ืื™ื˜ื™ื‘ื™, ื”ืจื‘ื” ื™ื•ืชืจ ื™ืงืจ ืžื”ืกื˜ื˜ื™ ื›ื™ ื ื“ืจืฉ ืชืงืฉื•ืจืช ื‘ื™ืŸ ื” threads ืœืื—ืจ ื›ืœ ืื™ื˜ืจืฆื™ื”, ื›ืžื• ื›ืŸ ืœืขื™ืชื™ื ืฆืจื™ืš ืœื”ื’ื“ื™ืจ ืืช ื”chunk size ืฉืœ thread ื›ืœืฉื”ื• ื•ื’ื ื‘ื’ืœืœ ืฉื”ืชื–ืžื•ืŸ ื”ื•ื ืžืขื™ืŸ ืงื•ืคืกื ืฉื—ื•ืจื” ืฉืœ ื”ืกืคืจื™ื™ื” ืœื ื‘ื”ื›ืจื— ืฉื ืงื‘ืœ ืชื–ืžื•ืŸ ืฉืžืชืื™ื ืœื ื•.

Shared/Private variables

ื›ืืžื•ืจ ืžืฉืชื ื™ื ื”ืžื•ื’ื“ืจื™ื ืžื—ื•ืฅ ืœืงื˜ืข ื”ืžืงื‘ื™ืœื™ ื”ื shared.
ืžืฉืชื ื™ื ื”ืžื•ื’ื“ืจื™ื ื‘ืชื•ืš ืงื˜ืข ืžืงื‘ื™ืœื™ ื”ื private, ื›ืœื•ืžืจ ื›ืœ thread ืžืงื‘ืœ ืขื•ืชืง ืฉืœ ืžืฉืชื ื™ื ืืœื•.

Pasted image 20240308230822.png|250

ื ืฉื™ื ืœื‘ ืฉื’ื ืžืฉืชื ื™ื ืกื˜ื˜ื™ื™ื ื‘ืชื•ืš ื‘ืœื•ืง ืžืงื‘ื™ืœื™, ื”ื shared.

ื ื™ืชืŸ ืœืฉื ื•ืช ืืช ื ืจืื•ืช ื‘ืจื™ืจืช ืžื—ื“ืœ ืฉืœ ื”ืžืฉืชื ื™ื:
Pasted image 20240308231021.png
i ื”ื•ืคืš ืื•ื˜ื•ืžื˜ื™ ืœืžืฉืชื ื” ืคืจื˜ื™ ื‘ืจื’ืข ืฉืฉืžื™ื ืื•ืชื• ื‘ืชื•ืš parallel for ื• j ื”ืคืš ืœืคืจื˜ื™ ื‘ื’ืœืœ ืฉื”ืฉืชืžืฉื ื• ื‘ื”ื’ื“ืจื” private(j).

ื”ืกื™ื‘ื” ืฉื”ื’ื“ืจื ื• ืืช ื”ืžืขืจื›ื™ื ื›static ื›ื“ื™ ืฉื”ื ื™ืฉื‘ื• ื‘ global data segment ื•ืœื ื‘stack ื•ื›ืš ื ืžื ืข ืž seg fault (ืžืขืจื›ืช ื”ื”ืคืขืœื” ืžื’ื“ื™ืจื” ืœ stack ื’ื•ื“ืœ ืงื‘ื•ืข ื•ืœืคืขืžื™ื ื”ื™ื ืœื ืชืงืฆื” ืขื•ื“ ืขื‘ื•ืจ ื”stack , ื ื™ืชืŸ ืœื‘ื“ื•ืง ืืช ื’ื•ื“ืœ ื”stack ื”ืงื‘ื•ืข ืœื›ืœ ืชื•ื›ื ื™ืช ืขืœ ื™ื“ื™ ื”ืจืฆืช ื”ืชื•ื›ื ื™ืช ulimit -s).

Screenshot 2024-03-08 at 23.17.22.png

ื ืฉื™ื ืœื‘: ื”ื”ืฆื”ืจื” private ืžื™ื™ืฆืจืช ืขื•ืชืง ืฉืœ ื”ืžืฉืชื ื” ื•ืœื›ืŸ ื ื•ืฆืจ ืขื•ืชืง ืฉืœ ื”ืžืฉืชื ื” ื‘ืชื•ืš ื”threads, ื”ืขืจืš ืฉืœ ื”ืขื•ืชืงื™ื ืœื ื‘ื ืœื™ื“ื™ ื‘ื™ื˜ื•ื™ ื‘ืžืฉืชื ื” ื”ืžืงื•ืจื™ ื›ืœื•ืžืจ ื›ืœ ืฉื™ื ื•ื™ ื‘ืชื•ืš ื”ืงื•ื“ ื”ืžืงื‘ื™ืœื™ ืœื ืžืฉืคื™ืข ืขืœ ื”ืžืฉืชื ื” ื”ืžืงื•ืจื™. ื›ืžื• ื›ืŸ ืื™ืŸ ื”ื‘ื˜ื—ื” ืฉื’ื ื”ืขืจืš ืฉืœ ื”ืžืฉืชื ื” ื”ืžืงื•ืจื™ ื™ืขื‘ื•ืจ ืœืขื•ืชืง ืœืžืฉืœ:

Pasted image 20240309005307.png
ืื•ืžื ื ื”ืขืจืš ื”ื™ื” 1, ืื‘ืœ ื›ืœ ืื—ื“ ืžื”threads ืงื™ื‘ืœ ืขืจืš ื”ืชื—ืœืชื™ ืฉืœ 0 ืขื‘ื•ืจ ื”ืขื•ืชืง ืฉืœ sum.

ืืคืฉืจ ืœื”ื’ื“ื™ืจ ืžืฉืชื ื” ืœื”ื™ื•ืช private ืจืง ืœื—ืœืง ืžืกื•ื™ื™ื ืžืชื•ืš ื”ืงื•ื“ ื”ืžืงื‘ื™ืœื™.
Pasted image 20240309005450.png

ื‘ืžืงืจื” ื”ื–ื” ื ื™ืชืŸ ืœืจืื•ืช ืื™ืš ืœืื—ืจ ื”ื™ืฆื™ืื” ืžื”ืœื•ืœืื” sum ื ืฉืืจ ืขื ืขืจืš 1.

Pasted image 20240309005812.png

Pasted image 20240309005924.png

ืžื” ื”ืขืจืš ืฉื™ื•ื“ืคืก ื›ืืŸ?
Pasted image 20240309005949.png
ื”firstprivate ืื•ืžืจ ืฉื›ืœ ื”threads ื™ืงื‘ืœื• ืืช ื”ืขืจืš ื”ื”ืชื—ืœืชื™ ื‘ืขื•ืชืง ืฉืœื”ื, ืฉื”ื•ื 0. ื”lastprivate ืื•ืžืจ ืฉื ืฉื™ื ื‘sum ื”ื—ื™ืฆื•ื ื™ ืืช ื”ืขืจืš ืฉืงื™ื‘ืœ ื”ืกื›ื•ื ื”ื—ืœืงื™ ืฉืœ ื”thread ื”ืื—ืจื•ืŸ. ื›ื™ื•ื•ืŸ ืฉื‘ื“ื™ืคื•ืœื˜ ืžื“ื•ื‘ืจ ื‘ื ื™ื”ื•ืœ ืกื˜ื˜ื™ ืื‘ืœ ื”ื—ืœื•ืงื” ื›ืืŸ ืœื ืฉื•ื•ื” (ื™ืฉ 9 ืื™ื˜ืจืฆื™ื•ืช ืขืœ 4 threads) ืœื ื ื™ืชืŸ ืœื—ื–ื•ืช ืืช ื”ืชื•ืฆืื” ืžืจืืฉ. ื ื ื™ื— ืฉ thread0 ื™ืงื‘ืœ ืืช 3 ื”ืื™ื˜ืจืฆื™ื•ืช ื”ืจืืฉื•ื ื•ืช ื•ื ื‘ื—ืŸ ืืช ื”ื”ื“ืคืกื•ืช

Pasted image 20240309011145.png|200

ื ื™ืชืŸ ืœืจืื•ืช ืฉื›ื™ื•ื•ืŸ ืฉ ID3 ืงื™ื‘ืœ ืจืง ืืช 2 ื”ืื™ื˜ืจืฆื™ื•ืช ื”ืื—ืจื•ื ื•ืช ืื– ื”ื•ื ื™ืกื›ื•ื ืืช ื”ืกื›ื•ื ื”ื—ืœืงื™ ืฉ i=8,9 ื•ืœื›ืŸ ื”ืคืœื˜ ื™ื”ื™ื” 17.

default

ื”ืจื’ืœ ืžื•ืžืœืฅ ื”ื•ื ืœื”ืฉืชืžืฉ ื‘ืžื™ืœืช ื”ืžืคืชื— default(none) ื›ื“ื™ ืœืงื‘ืœ ื‘ื“ื™ืงืช ืงื•ืžืคื™ืœืฆื™ื” ืฉืžื›ืจื™ื—ื” ืื•ืชื ื• ืœืจืฉื•ื ืขื‘ื•ืจ ื›ืœ ืžืฉืชื ื”, ืžื”ื™ ืจืžืช ื”ืฉื™ืชื•ืฃ ืฉืœื•. ื’ื•ืจื ืœืงื•ื“ ืœื”ื™ื•ืช ืžื•ื‘ืŸ ื™ื•ืชืจ.

Pasted image 20240309011436.png

Algorithms with OpenMP

ื—ื™ืฉื•ื‘ ืžืกืคืจื™ื ืจืืฉื•ื ื™ื™ื

ืงืœื˜: ืžืกืคืจ ื˜ื‘ืขื™ n
ืคืœื˜: ื›ืœ ื”ืžืกืคืจื™ื ื”ืจืืฉื•ื ื™ื™ื ืž 1 ืขื“ ืœ n.

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>

void calcPrimes(int n) {
    bool *prime = (bool *)malloc((n + 1) * sizeof(bool));
    if (prime == NULL) {
        printf("Memory allocation failed.\n");
        return;
    }

    // Initialize all elements of prime[] as true
    #pragma omp parallel for 
    for (int i = 0; i <= n; i++)
        prime[i] = true;

    // Mark non-prime numbers
    #pragma omp parallel for schedule(dynamic, 1)
    for (int p = 2; p * p <= n; p++) { 
            for(int d = 2; d<p && prime[p]; d++)
                 if(p % d == 0)
                     prime[p] = false;
    }

    // Print prime numbers
    printf("Prime numbers in the range 2 to %d are:\n", n);
    for (int p = 2; p <= n; p++) {
        if (prime[p])
            printf("%d ", p);
    }
    printf("\n");

    free(prime);
}
  1. ื”ืœื•ืœืื” ืฉืžื‘ืฆืขื™ื ืืชื—ื•ืœ ืœืžืขืจืš ื” prime ืชื”ื™ื” ืžืงื‘ื™ืœื™ืช
  2. ื”ืœื•ืœืื” ื”ื›ืคื•ืœื” ืฉืจืฆื” ืขื“ ื”ืฉื•ืจืฉ ืฉืœ n ื‘ืœื•ืœืื” ื”ื—ื™ืฆื•ื ื™ืช ื•ื‘ืœื•ืœืื” ื”ืคื ื™ืžื™ืช ืจืฆื™ื ืขืœ ื›ืœ ื”ืžืกืคืจื™ื ืขื“ ืœ p ื›ืืฉืจ ื”ืžืขืจืš ื‘ืžื™ืงื•ื ื” p ื”ื•ื true ื•ื‘ื•ื“ืงื™ื ื”ืื p ืžืชื—ืœืง ื‘ d ื•ืื ื›ืŸ ืžืฉื ื™ื ืืช ื”ืขืจืš ืฉืœ p ื‘ืžืขืจืš ื”ืจืืฉื•ื ื™ื™ื ืœื”ื™ื•ืช false. ืืช ื”ืœื•ืœืื” ื”ื–ืืช ื ืขืฉื” ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช ืขื ืชื–ืžื•ืŸ ื“ื™ื ืžื™. ื”ืกื™ื‘ื” ืฉืœื ื”ืฉืชืžืฉื ื• ื‘ collapse ื”ื™ื ื‘ื’ืœืœ ืฉื”ืžืงื‘ื™ืœื™ื•ืช ืฉืœ ื”ืื™ื˜ืจืฆื™ื” ื”ืคื ื™ืžื™ืช ืขืœื•ืœ ืœื™ืฆื•ืจ ืจื™ืฆื•ืช ื›ืคื™ืœื•ืช ื›ื™ื•ื•ืŸ ืฉื›ืœ thread ื‘ื•ื“ืง ืืช ื”ืžืขืจืš ื”ื’ืœื•ื‘ืœื™ ืฉืœ ื”ืžืกืคืจื™ื ื”ืจืืฉื•ื ื™ื™ื ื•ืœื›ืŸ ืฉื ื™ threads ื™ื›ื•ืœื™ื ืœืงื‘ืœ ืžืกืคืจื™ื ืฉื›ืœ ืžื”ื ืžืฉื ื” ืืช ื”ืžืขืจืš prime ื‘ืื•ืชื• ื”ืžื™ืงื•ื ืื‘ืœ ืฉื ื™ื”ื ื™ื’ืฉื• ืœืขืจืš ื”ื–ื” ื‘ื’ืœืœ ืฉื”ื ืขื•ื‘ื“ื™ื ื‘ืžืงื‘ื™ืœ ื•ืœื ืžืกื•ื ื›ืจื ื™ื.
  3. ื”ืกื™ื‘ื” ืœืฉื™ืžื•ืฉ ื‘ dynamic ื”ื™ื ืฉื”ืื™ื˜ืจืฆื™ื•ืช ื”ืคื ื™ืžื™ื•ืช ื”ืŸ ืœื ื‘ื’ื•ื“ืœ ืฉื•ื•ื”, ืœื›ืŸ ืขื“ื™ืฃ ืœื ื• ืœืชื–ืžืŸ ืœืคื™ FCFS ื•ืœืชืช ืœthreads ืฉืžืกื™ื™ืžื™ื ืงื•ื“ื ืœื›ืŸ ืืช ื”ืื™ื˜ืจืฆื™ื•ืช ื”ืงืฆืจื•ืช ื™ื•ืชืจ ืœืงื‘ืœ ืขื‘ื•ื“ื•ืช ื—ื“ืฉื•ืช.
  4. ืื™ืŸ ื‘ืืžืช ืกื™ื‘ื” chunk size = 1 ืื• ืœืฉื™ื ืžืกืคืจ ืื—ืจ, ื›ื“ื™ ืœืงื‘ื•ืข ื‘ื™ื“ื™ื•ืง ืฆืจื™ืš ืœืขืฉื•ืช ื ื™ืกื•ื™ ื•ืชื”ื™ื™ื”.

BFS

ืจืื™ื ื• ื›ื‘ืจ ื“ืจืš ืœื—ืฉื‘ bfs ืžืงื‘ื™ืœื™ ืขื pthreads. ื›ืขืช ื ืจืื” ื›ื™ืฆื“ ื ื™ืชืŸ ืœื—ืฉื‘ ืืช ื–ื” ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช ืขื OpenMP.

SEQUENTIAL_BFS(V, E, s)

   #pragma openmp parallel for
   for each ๐‘ฃโˆˆ๐‘‰
         ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ฃโ†๐‘คโ„Ž๐‘–๐‘ก๐‘’
         ๐‘‘_๐‘ฃโ†โˆž
         ๐œ‹_๐‘ฃโ†โˆ…
   ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ โ†๐‘”๐‘Ÿ๐‘’๐‘ฆ
   ๐‘‘_๐‘ โ†0
   define queue ๐‘„โ†{๐‘ }

   while ๐‘„โ‰ โˆ…
          ๐‘ฃโ†๐‘„.๐‘‘๐‘’๐‘ž๐‘ข๐‘’๐‘ข๐‘’( )
          ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ฃโ†๐‘๐‘™๐‘Ž๐‘๐‘˜
          #pragma openmp parallel for
          for each ๐‘ข adjacent to ๐‘ฃ
                 if ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ข is ๐‘คโ„Ž๐‘–๐‘ก๐‘’
                         #pragma omp critical
                                ๐‘„.๐‘’๐‘›๐‘ž๐‘ข๐‘’๐‘ข๐‘’(๐‘ข)
                         ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ขโ†๐‘”๐‘Ÿ๐‘’๐‘ฆ
                         ๐œ‹_๐‘ขโ†๐‘ฃ
                         ๐‘‘_๐‘ขโ†๐‘‘_๐‘ฃ+1

ื”ืฉื™ื˜ื” ื”ื–ืืช ื”ื™ื ื‘ืขื™ื™ืชื™ืช ื›ืคื™ ืฉืืžืจื ื• ื™ืฉ ืฆื•ืจืš ืœื”ื’ื“ื™ืจ ืชื•ืจ ืœื›ืœ ืฉื›ื‘ื” ืขืœ ืžื ืช ืœื”ืžื ืข ืžrace condition ืฉืขืœื•ืœ ืœื’ืจื•ื ืœืืœื’ื•ืจื™ืชื ืœืžืฆื•ื ืžืกืœื•ืœ ืืจื•ืš ื™ื•ืชืจ ืžื”ืžืกืœื•ืœ ื”ืงืฆืจ ื‘ื™ื•ืชืจ ืดื‘ื˜ืขื•ืชืด.

ื‘ OpenMP ื ื•ื›ืœ ืœื™ืฆื•ืจ ืชื•ืจ ืœื›ืœ ืจืžื” ืขื ืฉื ื™ ืชื•ืจื™ื ื‘ืœื‘ื“.

SEQUENTIAL_BFS(V, E, s)

   #pragma openmp parallel for
   for each ๐‘ฃโˆˆ๐‘‰
         ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ฃโ†๐‘คโ„Ž๐‘–๐‘ก๐‘’
         ๐‘‘_๐‘ฃโ†โˆž
         ๐œ‹_๐‘ฃโ†โˆ…
   ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ โ†๐‘”๐‘Ÿ๐‘’๐‘ฆ
   ๐‘‘_๐‘ โ†0
   define queue ๐‘„1โ†{๐‘ }
   define queue ๐‘„2โ†{ }

   while ๐‘„1โ‰ โˆ…
      #pragma openmp parallel for
      for each v in Q1    
          ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ฃโ†๐‘๐‘™๐‘Ž๐‘๐‘˜
          #pragma openmp parallel for
          for each ๐‘ข adjacent to ๐‘ฃ
                 if ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ข is ๐‘คโ„Ž๐‘–๐‘ก๐‘’
                         #pragma omp critical
                                ๐‘„2.๐‘’๐‘›๐‘ž๐‘ข๐‘’๐‘ข๐‘’(๐‘ข)
                         ๐‘๐‘œ๐‘™๐‘œ๐‘Ÿ_๐‘ขโ†๐‘”๐‘Ÿ๐‘’๐‘ฆ
                         ๐œ‹_๐‘ขโ†๐‘ฃ
                         ๐‘‘_๐‘ขโ†๐‘‘_๐‘ฃ+1
	    empty Q1
        swap Q1 and Q2

Bottom-up BFS algorithm

ื’ื™ืฉื” ืฉื”ื•ืฆื’ ื‘Direction Optimizing BFS ื”ืฆื™ื’ื” ื’ื™ืฉื” ืฉืžืชืื™ืžื” ื™ื•ืชืจ ืœืชื›ื ื•ืช ืžืงื‘ื™ืœื™. ื”ื’ื™ืฉื” ื”ื–ืืช ืื•ืžืจืช ืฉื ื—ื–ื™ืง ืžืขืจืš ืฉืœ frontier ื•ื‘ื›ืœ ืื™ื˜ืจืฆื™ื” bottom-up ื‘ืžืงื•ื ืœืจื•ืฅ ืขืœ ื›ืœ ื”ืฉื›ื ื™ื ืฉืœ ืงื•ื“ืงื•ื“ vโˆˆfrontier , ื ืจื•ืฅ ืขืœ ื›ืœ ื”ืงื•ื“ืงื•ื“ื™ื ืฉืœื ื”ื•ื’ื“ืจ ืœื”ื parent ื‘ืจื’ืข ืฉืžืฆืื ื• ื›ื–ื” ืจืฆื™ื ืขืœ ื›ืœ ื”ืฉื›ื ื™ื ืฉืœื• ืฉื ืžืฆืื™ื ื‘frontier ื•ืžื‘ืฆืขื™ื ืขืœื™ื”ื BFS. ื‘ืžื™ืœื™ื ืื—ืจื•ืช, ื ื‘ื“ื•ืง ืขื‘ื•ืจ ื›ืœ ื”ืงื•ื“ืงื•ื“ื™ื ื”ืื ื™ืฉ ืœื”ื ืฉื›ืŸ ืฉื”ื•ื ื‘ืจืžื” ื”ื ื•ื›ื—ื™ืช ืฉืฆืจื™ืš ืœืกืจื•ืง.

ื”ืกื™ื‘ื” ืฉื‘ืฆื•ืจื” ื”ืžืงื‘ื™ืœื™ืช ื”ืฉื™ื˜ื” ื”ื–ืืช ื™ื•ืชืจ ื™ืขื™ืœื” ื”ื™ื ื‘ื’ืœืœ ืฉืื ื—ื ื• ืžื•ื ืขื™ื Race conditions ืจื‘ื™ื ืœืื•ืจ ื”ืขื•ื‘ื“ื” ืฉื›ืœ thread ื ื™ื’ืฉ ืœืžืขืจืš parent ื‘ืื™ื ื“ืงืก ืฉืžื–ื•ื”ื” ืขื ื”ืงื•ื“ืงื•ื“ ืฉืขืœื™ื• ื”ื•ื ืขื•ืžื“. ื ื•ื›ืœ ื’ื ืœื”ืžื ืข ืžืกื ื›ื•ืŸ ืฉืœ ืงื‘ื•ืฆืช ื” next ื ื•ื›ืœ ืœืžืžืฉ ืืช ื–ื” ื‘ืืžืฆืขื•ืช ืžืขืจืš ืขื•ืžืงื™ื ื•ื›ื›ื” ื›ืœ thread ื ื™ื’ืฉ ืจืง ืœื›ืชื•ื‘ืช ื”ืžืชืื™ืžื” ืœืงื“ื•ืงื•ื“ ื”ืžืชืื™ื ืขื‘ื•ืจื•.

ื”ืืœื’ื•ืจื™ืชื bottom-up ื‘ืฆื•ืจื” ืกื“ืจืชื™ืช ื”ืจื‘ื” ืคื—ื•ืช ื˜ื•ื‘ ืžื”ื’ื™ืฉื” ื”ืจื’ื™ืœื” ืฉืื ื—ื ื• ืžื›ื™ืจื™ื.
ื”ืžื™ืžื•ืฉ ื”ืกื“ืจืชื™ ื ืจืื” ื›ืš:
Pasted image 20240309184052.png|300
Pasted image 20240309184021.png|300

ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช , ื ื•ื›ืœ ืœืืชื—ืœ ืžืขืจืš ื”parents ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช ื•ื’ื ื‘ืฆื•ืจื” ื“ื™ื ืžื™ืช ืœืจื•ืฅ ืขืœ ื›ืœ ื”ืงื•ื“ืงื•ื“ื™ื (ื”ืกื™ื‘ื” ืฉื”ืฆื•ืจื” ื”ื™ื ื“ื™ื ืžื™ืช ื”ื™ื ื‘ื’ืœืœ ืฉืื ื—ื ื• ืœื ื™ื•ื“ืขื™ื ืื™ื–ื” ืงื•ื“ืงื•ื“ื™ื ื”ื ืขื ื™ื•ืชืจ ืฉื›ื ื™ื ื•ืื™ื–ื” ืขื ืคื—ื•ืช). ื”ืœื•ืœืื” ื”ืฉื ื™ื™ื” ื›ื‘ืจ ื™ื›ื•ืœื” ืœื”ื™ื•ืช ืกื˜ื˜ื™ืช ืฉื›ืŸ ื›ืžื•ืช ื”ืขื‘ื•ื“ื” ื”ื™ื ืฉื•ื•ื” ื‘ื›ืœ ืื™ื˜ืจืฆื™ื” ืคื ื™ืžื™ืช. ื›ืžื• ื›ืŸ ื ืฉื™ื ืœื‘ ืฉืื™ืŸ ื›ืืŸ ืื™ื–ื” ืกื ื›ืจื•ืŸ ืคืจื˜ ืœืกื ื›ืจื•ืŸ ื”ื“ื™ืคื•ืœื˜ื™, ื‘ืžืงื•ื ืฉื ืขื‘ื•ื“ ืขื ืฉื ื™ ืชื•ืจื™ื ืขื counter ืฉื”ื•ืคืš ืœื”ื™ื•ืช 0 ืจืง ื›ืืฉืจ ื” bottom-up ืœื ืžื•ืฆื ืฉื•ื ืงื•ื“ืงื•ื“ ืฉื”ื”ื•ืจื” ืฉืœื• ืดืœื ืงื™ื™ืืด ื‘ืžืขืจืš ื”parent.

Pasted image 20240309183913.png

DFS

ื ืจืื” ืงื•ื“ c ืฉืžืžืฉ DFS ืžืงื‘ื™ืœื™ ื‘ืืžืฆืขื•ืช OMP:

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

#define MAX_NODES 100

int graph[MAX_NODES][MAX_NODES];
int num_nodes;

enum colors {NOT_VISITED, VISIT_IS_IN_TASK_QUEUE, ALREADY_VISITED};
enum colors visited[MAX_NODES];

void parallel_dfs_visit(int node) {
    visited[node] = ALREADY_VISITED;
    printf("Node %d is visited by thread %d\n", node, omp_get_thread_num());

    for (int i = 0; i < num_nodes; i++) {
        if (graph[node][i]) {
            #pragma omp critical 
            {
                if (visited[i] == NOT_VISITED) {
                    printf("thread %d: create task for node %d\n", omp_get_thread_num(),i);
                    visited[i] = VISIT_IS_IN_TASK_QUEUE;
                    #pragma omp task
                        parallel_dfs_visit(i);
                }
            }
        }
    }
}

void dfs() {
    for (int i = 0; i < num_nodes; i++) {
        printf("thread %d: examine node %d\n", omp_get_thread_num(), i);
        if (visited[i] == NOT_VISITED) {
            printf("\nstarting DFS visit from %d\n", i);
            
            #pragma omp parallel
            {
                #pragma omp single
                {
                    parallel_dfs_visit(i);
                }
            }
            
        } else if (visited[i] == VISIT_IS_IN_TASK_QUEUE) {
            perror("error: VISIT_IS_IN_TASK_QUEUE shouldn't be possible here\n");
        }
    }
}

ื”ืงื•ื“ ืžื›ื ื™ืก thread ืื—ื“ ืœืชื•ืš ื” dfs_visit ื•ื”ื•ื ื›ืœ ืคืขื ืžื•ืกื™ืฃ task ืœืคื•ื ืงืฆื™ื™ืช ื”visit ื”ืžืงื‘ื™ืœื™ืช. ื›ื›ื” ื‘ืขืฆื ืื ื—ื ื• ืžื‘ืฆืขื™ื ืกืจื™ืงื” ืขื•ืžืง ืื‘ืœ ื›ืœ ื”ืฉื›ื ื™ื ืฉืœ ืงื•ื“ืงื•ื“ ื›ืœืฉื”ื• ืขื•ืฉื™ื ืืช ื–ื” ืžืงื‘ื™ืœื™ (ืงืฆืช ืขื™ื•ื•ืช ืฉืœ DFS ืฉืžืฉืœื‘ ื‘ืชื•ื›ื• BFS). ืฆืจื™ืš ืœืฉื™ื ืœื‘ ืฉื’ื ืขื ื”ืขื™ื•ื•ืช ื”ื–ื” ืžื–ื›ื™ืจ BFS , ื›ืœื•ืžืจ ื”ื™ื™ื ื• ื™ื›ื•ืœื™ื ืื•ืœื™ ืœื‘ื ื•ืช ืžื ื’ื ื•ืŸ ืžื‘ื•ืกืก priorities ืฉื ื•ืชืŸ ืœ task ืืช ื”ืขื“ื™ืคื•ืช ืฉืœ ื”ืจืžื” ืฉืœ ืื•ืชื• ืงื•ื“ืงื•ื“ ื‘ื’ืจืฃ. ืื‘ืœ ืขื“ื™ื™ืŸ ื–ื” ืœื ื”ื™ื” BFS ืžื•ืฉืœื ืฉื›ืŸ ื™ื›ืœื• ืœื”ื™ื•ืช ืžืฆื‘ื™ื ืฉื”task queue ื™ืงืฆื” ืžืฉื™ืžื•ืช ืขื ืขื•ืžืง ื ืžื•ืš ื™ื•ืชืจ ื›ื™ ื™ืฉ threads ื‘ืื•ื•ื™ืจ, ืœืคื ื™ ืฉื›ืœ ื”ืจืžื” ื”ืกืชื™ื™ืžื”.

task priority

ืืคืฉืจ ืœื”ื‘ื™ื ืœ task ืขืจืš ืฉืœ priority ื‘ืขืช ื”ื™ืฆื™ืจื” ืฉืœื• ืขื task priority(#p) ื‘ pragma block.

Warning

ื”ืžื™ืžื•ืฉ ื”ื ืดืœ ืื™ื ื• ืžื•ืฉืœื ืฉื›ืŸ ื”ื•ื ื ื•ืขืœ ืืชDFS visit ืขื critical block , ื™ืฉ ืœื›ืš ื”ืฉืคืขื•ืช ืฉืœ ืคื’ื™ืขื” ื‘ืžืงื‘ื™ืœื™ื•ืช ืขืœื™ื”ื ื ื“ื‘ืจ ื‘ื”ืžืฉืš

Sections

ื‘ืขื™ื™ืช producer/consumer ืžื•ื’ื“ืจืช ื‘ืื•ืคืŸ ื”ื‘ื. ื™ืฉ ืœื ื• bounded buffer ื‘ื’ื•ื“ืœ N ืชืื™ื. producer ืžื™ื™ืฆืจ items ื•ืฉื ืื•ืชื ื‘- buffer ื‘ืžืงื•ื ืคื ื•ื™. consumer ืฆื•ืจืš items ื•ืžืคื ื” ืžืงื•ื ื‘- buffer. ืขืœ consumer ืœื—ื›ื•ืช ื›ื“ื™ ืฉ- item ื™ื›ื ืก ืœ- buffer ืœืคื ื™ ืฉื™ื•ื›ืœ ืœืฆืจื•ืš ืื•ืชื•, ื•ืขืœ producer ืœื—ื›ื•ืช ืขื“ ืฉื™ื”ื™ื” ืžืงื•ื ืคื ื•ื™ ื‘- buffer ื›ื“ื™ ืฉื™ื•ื›ืœ ืœืฉื™ื ืฉื item ื—ื“ืฉ.

ืงื•ื“ ืกื“ืจืชื™ ื™ืจืื” ืžื”ืฆื•ืจื” ืฉื‘ื” ื‘ื›ืœ ืื™ื˜ืจืฆื™ื” ื ื•ืฆืจ item ืื—ื“, ื•ื”ื•ื ืžื™ื“ ื ืฆืจืš.
Pasted image 20240309012234.png

ืื™ืš ืœื”ืคื•ืš ืื•ืชื” ืœืชื•ื›ื ื™ืช ืžืงื‘ื™ืœื™ืช ื›ื“ื™ ืฉ- producer ื•- consumer ื™ื•ื›ืœื• ืœืขื‘ื•ื“ ื›ืœ ืื—ื“ ื‘ื ืคืจื“ ื•ื‘ืžืงื‘ื™ืœ ? ืขื ื”ื›ืœื™ื ืฉืœืžื“ื ื• ื‘ OpenMP ืขื“ ื›ื”, ื ืจืื” ืฉืœื ืงื™ื™ื ื›ืœื™ ืฉื™ืขื–ื•ืจ ืœื ื• ืฉื›ืŸ ืื ื—ื ื• ืœื ืจื•ืฆื™ื ืœื—ืœืง ืžืฉื™ืžื” ืœื—ืœืงื™ื ืืœื ืœื”ื’ื“ื™ืจ ืฉืชื™ ืžืฉื™ืžื•ืช ืฉื•ื ื•ืช ืขื‘ื•ืจ ื›ืœ thread. ื–ื” ื‘ืขื™ื” ืฉืœื ื ืชืงืœื ื• ื‘ื” ืขื“ ื›ื” ื›ื™ ืื ื—ื ื• ืจื•ืฆื™ื ืœื—ืœืง ืชืคืงื™ื“ื™ื ื•ืœื ืื™ื˜ืจืฆื™ื•ืช.

Pasted image 20240309125402.png|300

ื”ื›ืœื™ ื”ืžื“ื•ื‘ืจ ื ืงืจื sections:
Pasted image 20240309013048.png

  1. ืฉื ื™ keyword ื—ื“ืฉื™ื - section ื• sections.
  2. ื›ืœ ื” sections ืจืฆื™ื ื‘ืžืงื‘ื™ืœ ื‘thread ื ืคืจื“ ืœื›ืœ section
  3. ื”ืชื•ื›ืŸ ืฉืœ ื›ืœ section ื”ื•ื ืกื“ืจืชื™ ืืœื ืื ื›ืŸ ืžื’ื“ื™ืจื™ื ืื—ืจืช
  4. ื”ื’ื“ืจื ื• section ืžืงื‘ื™ืœื™ ืขืฆืžืื™ ืขื‘ื•ืจ producer thread ื•- consumer thread. ืขื‘ื•ืจ ื›ืœ ืชื A[i] ื™ืฆืจื ื• ืžืฉืชื ื” flags[i] ืขื‘ื•ืจ ื›ืœ ืชื ื‘ืžืขืจืš A ื›ื“ื™ ืฉ- producer ื™ื•ื›ืœ ืœื™ื™ื“ืข ืืช consumer ืฉื”ื•ื ืžื™ืœื A[i] ื•ืชื ื–ื” ืžื•ื›ืŸ ืœืฆืจื™ื›ื”, ื•- consumer ื™ื•ื›ืœ ืœื™ื™ื“ืข ืืช producer ืฉื”ื•ื ืฆืจืš A[i] ื•ืชื ื–ื” ืžื•ื›ืŸ ืœืžื™ืœื•ื™ ื—ื•ื–ืจ.

ื‘ืงื•ื“ ื”ื ืดืœ ื™ืฉื ื” ื‘ืขื™ื”, ื”ืคืœื˜ ืขืœื•ืœ ืœื”ื™ื•ืช ืฉื•ื ื” ื‘ื›ืœ ืจื™ืฆื” ืžื›ื™ื•ื•ืŸ ืฉืฉื ื™ threads ืขืœื•ืœื™ื ืœืขื‘ื•ื“ ืขืœ ืฉื ื™ ืžืขื‘ื“ื™ื ืฉื•ื ื™ื ื•ืœื›ืŸ ื”ืขื“ื›ื•ืŸ ืฉืœ A[i] ืขืœ ื™ื“ื™ producer threads ืขืœื•ืœ ืœื”ื™ื•ืช ืœื ื ื’ื™ืฉ ืขื‘ื•ืจ consumer thread ื•ื›ื ืดืœ ืœื’ื‘ื™ FLAGS[i] . ื–ื” ื›ืžื•ื‘ืŸ ืชืœื•ื™ ื‘ืžื“ื™ื ื™ื•ืช ื”ื›ืชื™ื‘ื” ืฉืœ ื”cache ืื‘ืœ ื–ื” ืขื“ื™ื™ืŸ ื™ื•ืฆืจ ืื™ ื“ื˜ืจืžื™ื ื™ื–ื.

ื›ื“ื™ ืœื˜ืคืœ ื‘ื‘ืขื™ื” ื”ื–ืืช ื ื™ืชืŸ ืœื”ืฉืชืžืฉ ื‘ืคื•ื ืงืฆื™ื” flush
Pasted image 20240309130114.png

ื”ืคื•ื ืงืฆื™ื” ื”ื–ืืช ื‘ืขืฆื ืžื›ืจื™ื—ื” ืืช ื”ืžื™ื“ืข ืœื”ืชืขื“ื›ืŸ ื’ื ื‘ram ื•ืœื ืจืง ื‘ cache. ื ืฉื™ื ืœื‘ ืฉื–ื” ืœื ื”ืžืฆื‘ ืฉืœ cache invalidation ืฉื“ื™ื‘ืจื ื• ืขืœื™ื• ืงื•ื“ื, ืžืฆื‘ ืฉืœ cache invalidation ื’ื•ืจื ืœื˜ืขื™ื ื” ืฉืœ ื”ืžื™ื“ืข ืžื”RAM ื•ืœื ืœืขื“ื›ื•ืŸ ื”ืžื™ื“ืข ื‘RAM.

ื›ื›ื” ื ื‘ื˜ื™ื— ืืช ืืžื™ื ื•ืช ื”ืžื™ื“ืข ื‘ cache, ืื‘ืœ ืขื“ื™ื™ืŸ ื”ืคืœื˜ ืขืœื•ืœ ืœื”ื™ื•ืช ืฉื•ื ื” ื‘ื›ืœ ื”ืจืฆื” ื‘ื’ืœืœ ืฉื™ื›ื•ืœ ืœื”ื™ื•ืช ืžืฆื‘ ืฉืœืคื ื™ ืฉื ื’ื™ืข ืœ flush ื”ื–ื›ืจื•ืŸ ืขื‘ื•ืจ flags ื™ืชืขื“ื›ืŸ ืื‘ืœ ืขื‘ื•ืจ ื”ืžืขืจืš ืœื, ืœื›ืŸ ื”consumer ื™ืฆื ืžื”busy wait ืฉืœื• (ืœื•ืœืืช ื” while) ื•ื™ืกื›ื•ื ืขืจืš ืœื ืขื“ื›ื ื™.

ื”ืคืชืจื•ืŸ ื”ื•ื ืœื”ื•ืกื™ืฃ ืขื•ื“ flush ื’ื ืื—ืจื™ ืขื“ื›ื•ืŸ ืฉืœ A ื•ื’ื ืื—ืจื™ ืขื“ื›ื•ืŸ ืฉืœ flags

Pasted image 20240309132141.png

ืฆืจื™ืš ืœืฉื™ื ืœื‘ ืฉื‘ืžื™ืœื™ื ืื—ืจื•ืช ื”ืคืขื•ืœื” ื”ื–ืืช ืื•ืžืจืช ืœืœื™ื‘ื•ืช ืœื ืœืขื‘ื•ื“ ืขื ื”cache ื‘ื›ืœืœ ืืœื ืจืง ืขื RAM ื•ืœื›ืŸ ื–ืืช ืคืขื•ืœื” ืžืื•ื“ ื™ืงืจื” ื•ืœื•ืงื—ืช ื–ืžืŸ ืจื‘.

Nested Sections

ื ื•ื›ืœ ืœื”ืฆื”ื™ืจ ืขืœ nested section ื‘ืชื•ืš section ืงื™ื™ื. ื›ืžื• nested for , ื” nested section ื‘ืื•ืคืŸ ื“ื™ืคื•ืœื˜ื™ ืจืฆื™ื ื‘ืฆื•ืจื” ืกื“ืจืชื™ืช ืขืœ ื™ื“ื™ ื”thread ืฉืžืจื™ืฅ ืืช ื”section ืื‘.

Pasted image 20240309170140.png

ืื ื ืจืฆื” ืœืฉื ื•ืช ืืช ื”ื”ืชื ื”ื’ื•ืช ื”ื–ืืช ื ืฉืชืžืฉ ื‘ omp_set_nested(1) ืžื—ื•ืฅ ืœ pragma block ื›ื“ื™ ืœืืคืฉืจ ืžืงื‘ื™ืœื™ื•ืช ืžืงื•ื ื ืช.

Tasks

Linked List

ื ืจืฆื” ืœื”ืจืื•ืช ื›ื™ืฆื“ ืœื”ืฉืชืžืฉ ื‘ OpenMP ื›ื“ื™ ืœืžืงื‘ืœ ืืช ื”ืงื•ื“ ื”ื‘ื

Pasted image 20240309170403.png|300

ื”ื”ืฆืขื” ื”ืื™ื ื˜ื•ืื™ื˜ื™ื‘ื™ืช ื›ื“ื™ ืœืžืงื‘ืœ ืœื•ืœืืช while , ื“ื‘ืจ ืฉืœื ืžืชืืคืฉืจ ื‘ default ื”ื™ื ืœื”ืžื™ืจ ืืช ื”ืจืฉื™ืžื” ื”ืžืงื•ืฉืจืช ืœืžืขืจืš ื•ืื– ืขืœ ื–ื” ืœืจื•ืฅ ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช.

Pasted image 20240309170532.png
ื™ืฉ ืคื” overhead ืฉืœ ื”ืขืชืงื”. ืœื”ืฉืชืžืฉ ื‘ sections ื”ื•ื ื’ื ื‘ืขื™ื™ืชื™ ื›ื™ ืื ื—ื ื• ืœื ื™ื›ื•ืœื™ื ืœื“ืขืช ืžื”ื• ืžืกืคืจ ื”sections ืฉื ืจืฆื” ืœื™ื™ืฆืจ. ื ืจืฆื” ื›ืœื™ ืฉืžืชืื™ื ืœืžืงื‘ื•ืœ ืฉืœ ืžืฆื‘ ื›ื–ื” ืฉื‘ื• ืœื ื™ื•ื“ืข ื›ืžื” tasks ื ืจืฆื” ืœื™ื™ืฆืจ, ื”ื›ืœื™ ื”ื–ื” ื ืงืจื Tasks.

ื ืกืชื›ืœ ืขืœ ื”ืกื™ื ืชืงืก:
Pasted image 20240309170726.png|300
ื”ื”ื‘ื“ืœ ื”ืžื”ื•ืชื™ ื‘ื™ืŸ tasks ืœ sections ื”ื•ื ืฉื›ืืŸ ื™ืฉ task queue ืฉืžื ื”ืœ ืืช ื”ืžืฉื™ืžื•ืช. ื‘ืชื•ืš ื”ืงื•ื“ ื”ืžืงื‘ื™ืœื™ ืขื˜ืคื ื• ืืช ื”ืชื•ื›ื ื™ืช ื‘ืงื•ื“ single ื›ื“ื™ ืฉืจืง thread ืื—ื“ ื™ื™ืฆืจ ืืช ื”tasks ื•ื™ื›ื ื™ืก ืื•ืชื ืœ task pool.

ื›ืืŸ ื™ืฉ ืคื” ืžื™ืžื•ืฉ ื“ื•ืžื” ืžืื•ื“ ืœThread Pool , ื” threads ืฉื•ืœืคื™ื ืžืฉื™ืžื•ืช ืžื”task queue ื›ืœ ืขื•ื“ ื”ื ื™ื›ื•ืœื™ื.

Pasted image 20240309171008.png
ืขื‘ื•ืจ ื”ืชื•ื›ื ื™ืช ื”ื ืดืœ ื”ืคืœื˜ ื™ื”ื™ื”

Pasted image 20240309171020.png

ื‘ื”ืจืฆื” ื”ื–ื• ืจื•ืื™ื ืฉ- thread2 ื”ืกืคื™ืง ืœื‘ืฆืข 3 tasks, ื•ื›ืœ ืฉืืจ ื”- threads ืจืง task ืื—ื“. ื–ื” ืชืœื•ื™ ื‘ืชื–ืžื•ืŸ ืฉืœ threads ื•ื‘ืžืจื•ื›ื‘ื•ืช ืฉืœ tasks ืื ื”ื ืฉื•ื ื™ื.

ื ื™ืชืŸ ื’ื ืœื”ื’ื“ื™ืจ explicit barriers ืœ tasks ื›ืœื•ืžืจ, ื ื™ืชืŸ ืœื—ื›ื•ืช ืœืžืกืคืจ tasks ืฉื™ืกืชื™ื™ืžื• ืœืคื ื™ ืฉืžื›ื ื™ืกื™ื tasks ื—ื“ืฉื™ื:

Pasted image 20240309171244.png|300

ืื ื ืจืฆื” ืœืžืžืฉ ืืช ื–ื” ืขืœ ื”linked list ืฉืœื ื• ื ืจืฆื” ืœื”ื‘ื™ืŸ ืžื” ื”ืžื’ื‘ืœื•ืช ืฉืœื ื ื™ืชืŸ ืœื”ืžื ืข ืžื”ื ื•ืžื” ืื ื—ื ื• ื™ื›ื•ืœื™ื ืœืขืฉื•ืช.

  1. ื—ื™ื™ื‘ื™ื ืœืจื•ืฅ ืขืœ ื›ืœ ื”ืจืฉื™ืžื” ื‘ืื•ืคืŸ ืžืกื•ื ื›ืจืŸ ื›ืœื•ืžืจ thread 1 ืื• ืขื ื›ืœื™ ืกื ื›ืจื•ืŸ.
  2. ืฆืจื™ืš ืœื”ืฉืชืžืฉ ื‘ืœื•ืœืืช while ื›ื™ ื’ื•ื“ืœ ื”ืจืฉื™ืžื” ืœื ื™ื“ื•ืข.
  3. ื›ืœ process ื™ื›ื•ืœ ืœื”ื™ื•ืช ืžืงื‘ื™ืœื™.

ืื ื›ืŸ ื”ืžื•ื“ืœ ื™ื™ืจืื” ืžื”ืฆื•ืจื”

Pasted image 20240309171705.png|300

ื”ืงื•ื“ ืžืื•ื“ ืคืฉื•ื˜ ื•ื™ืจืื” ื›ืš:
Pasted image 20240309171223.png|300

ื‘ืžื™ืœื™ื- thread ืื—ื“ ืžื›ื ื™ืก ืืช ื”tasks , ื›ืœ task ื”ื•ื ื”ืคืขื•ืœื” process ืขืœ ืื™ื‘ืจ ื›ืœืฉื”ื• ื‘ืจืฉื™ืžื”. ืฉืืจ ื” threads ืฉื•ืœืคื™ื ืืช ื”tasks ื•ืžื‘ืฆืขื™ื ืื•ืชื ืžืงื‘ื™ืœื™.

Fibonacci

ืคื™ื‘ื•ื ืืฆืณื™ ื”ืคืจื“ ื•ืžืฉื•ืœ ื”ื•ื ืืœื’ื•ืจื™ืชื ืฉืคื•ืชืจ ืืช ืคื™ื‘ืณ ื‘ื–ืžืŸ O(logโกn).
ืื ื—ื ื• ืžื›ื™ืจื™ื ื’ื ืืœื’ื•ืจื™ืชื ืคืฉื•ื˜ ื™ื•ืชืจ ืฉืคื•ืชืจ ื‘ืฆื•ืจื” ืจื™ืงื•ืจืกื™ื‘ื™ืช ื‘ื–ืžืŸ ืžืื•ื“ ืื™ื˜ื™

Pasted image 20240309172333.png|300

ืื ื ืจืฆื” ืœืžืงื‘ืœ ืืช ื–ื” ื ื•ื›ืœ ืœื—ืฉื•ื‘ ืขืœ ืœื”ื›ื ื™ืก ื›ืœ ืงืจื™ืื” ืจืงื•ืจืกื™ื‘ื™ืช ืœ task.

Pasted image 20240309172420.png|300

ื”ืชื•ื›ื ื™ืช ื”ื–ืืช ืžืื•ื“ ืœื ื™ืขื™ืœื” ืฉื›ืŸ ืื ื—ื ื• ืžืฉืงื™ืขื™ื ื”ืžื•ืŸ ื–ืžืŸ ื•ืžืงื•ื ื‘ื™ืฆื™ืจืช 2n tasks ื‘ queue.

Pasted image 20240309172525.png

ื ื•ื›ืœ ืœื”ื’ื“ื™ืจ ืงื•ื“ ืฉืžื‘ืงืจ ืืช ื›ืžื•ืช ื”tasks ื•ืžื•ื ืข ืžื”task queue ืœื”ื›ื™ืœ ื›ืžื•ืช ื’ื“ื•ืœื” ื™ื•ืชืจ ืฉืœ ืžืฉื™ืžื•ืช ืž capacity ืฉื ื’ื“ื™ืจ ืžืจืืฉ

Pasted image 20240309172814.png|300

Pasted image 20240309172825.png

ื”ืงื•ื“ ื”ื–ื” ื‘ืขืฆื ืžื’ื“ื™ืจ ืฉืื—ืจื™ ืขื•ืžืง ืขืฅ ืจืงื•ืจืกื™ื” ืฉืœ 5 ื”ืงื•ื“ ื™ืขื‘ื•ืจ ืœื—ืฉื‘ ืืช ืฉืืจ ื”ืขื•ืžืงื™ื ื‘ืฆื•ืจื” ืกื“ืจืชื™ืช, ื›ืœ ืขื•ืžืง ืขืฅ ื ืžื•ืš ืžื–ื” ื™ืขืฉื” ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช.

Synchronization

ื›ื›ืœืœ ืืฆื‘ืข ื ืจืฆื” ืœื”ืžื ืข ืžืกื ื›ืจื•ืŸ ื›ื“ื™ ืœื”ืขืœื•ืช ืจืžืช ื”ืžืงื‘ื™ืœื•ืช.

Pasted image 20240310162652.png

ื™ืฉ ืฉื ื™ ืกื•ื’ื™ ืกื ื›ืจื•ืŸ ื‘ openMP:

High level synchronization

  1. critical
  2. barriers

Advanced synchronization operations

  1. ordered
  2. locks
  3. atomic
  4. flush

Ordered

ื”ืงื•ื“ ืฉื›ืชื•ื‘ ื‘ืชื•ืš ordered block ืžื‘ื˜ื™ื— ืฉื’ื ืื ื”ืจื™ืฆื” ืชื”ื™ื” ืžืงื‘ื™ืœื™ืช ื”ืกื“ืจ ืฉื‘ื• ื”ืงื•ื“ ื™ื‘ื•ืฆืข ื™ื”ื™ื” ืœืคื™ ื”ืกื“ืจ ืฉื‘ื•ื ื”ืงื•ื“ ืžื•ืคื™ืข ื‘ parallel region ืœืžืฉืœ ื‘ื“ื•ื’ืžื” ืขื ืœื•ืœืื”


do_some_work(i) {
     printf("work% in progressโ€ฆ\n", i)
     return i
}

main() {
   #pragma omp parallel
   {
           #pragma omp for ordered
           for i = 0 to 4
               result = do_some_work(i)
               #pragma omp ordered
                     printf("work%d: %d\n", result)
   }
}

ื”ืคืœื˜ ื™ื”ื™ื”
Pasted image 20240310163128.png|200

ื ื™ืชืŸ ืœืจืื•ืช ืฉื›ืœ ื—ื™ืฉื•ื‘ ื ืขืฉื” ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช ืขืœ ื™ื“ื™ ื”threads ืื‘ืœ ื”ื—ืœืง ืฉืœ ื” ordered ื ืขืฉื” ืœืคื™ ืกื™ื“ื•ืจ ืกื“ืจืชื™ ืฉืœ ื”ืื™ื˜ืจืฆื™ื•ืช.

ื“ื•ื’ืžื” ื ื•ืกืคืช ื™ื›ื•ืœื” ืœื”ื™ื•ืช ืขืœ sections

#pragma omp parallel
{
    #pragma omp sections
    {
        #pragma omp section
        {
            #pragma omp ordered
            printf("Thread %d is executing section 1.\n", omp_get_thread_num());
        }
        #pragma omp section
        {
            #pragma omp ordered
            printf("Thread %d is executing section 2.\n", omp_get_thread_num());
        }
        #pragma omp section
        {
            #pragma omp ordered
            printf("Thread %d is executing section 3.\n", omp_get_thread_num());
        }
    }
}

ื›ืืŸ ื”sections ืจืฆื™ื ื‘ืžืงื‘ื™ืœ ืื‘ืœ ื”ื”ื“ืคืกื•ืช ื™ื”ื™ื• ืœืคื™ ืกื“ืจ ื”ื•ืคืขืชืŸ ื‘ืชื•ืš ื” pragma block.

Pasted image 20240310163445.png

Critical

ื“ื™ื‘ืจื ื• ืขืœ critical ื›ื‘ืœื•ืง ื”ืžืืคืฉืจ ืจืง ืœ thread ืื—ื“ ืœื’ืฉืช ืœืงื˜ืข ืงื•ื“ ืงืจื™ื˜ื™. ืขื ื–ืืช, ื™ืฉ ื›ืžื” ื“ื’ืฉื™ื ืฉืœื ื“ื™ื‘ืจื ื• ืขืœื™ื”ื, ืœืžืฉืœ, ืื ืื ื™ ืœื ื ื•ืชืŸ ืฉื ืœcritical sections ืฉืœื™ ื‘ืชื•ืš ื”ื‘ืœื•ืง ืื– ื™ืฉ ืจืง ืžื ืขื•ืœ ืื—ื“ ืฉืื—ืจืื™ ืขืœื™ื”ื ื›ืœื•ืžืจ ืื ื ืขืœืชื™ ืฉืชื™ ืื–ื•ืจื™ ืงื•ื“ ืงืจื™ื˜ื™ื ืขื ื”ืงื•ื“ pragma omp critical ืื– ืจืง thread ืื—ื“ ื™ื•ื›ืœ ืœื’ืฉืช ื‘ื›ืœ ืคืขื ืœืื—ื“ ืžืงื˜ืข ื”ื‘ืœื•ืง ื”ืœืœื•.

Pasted image 20240310163820.png|250

ืืคืฉืจ ืœืชืงืŸ ื–ืืช ืขืœ ื™ื“ื™ ืžืชืŸ ืฉืžื•ืช ืœ critical sections ื›ื“ื™ ืœืชืช ืœื›ืœ ืื—ื“ ืžื ืขื•ืœ ืฉื•ื ื”

Pasted image 20240310163901.png|250

ื ืฉื™ื ืœื‘ ืฉ Critical ื”ื•ื ืžื ื’ื ื•ืŸ ื ืขื™ืœื” ื—ื–ืง ืžื“ื™ ื‘ืขื™ืงืจ ืžื›ื™ื•ื•ืŸ ืฉื”ื•ื compiler directive ื•ืœื›ืŸ ื”ืจื‘ื” ืคืขืžื™ื ืื™ืŸ ืœื ื• ืืช ื”ื™ื›ื•ืœืช ืœื‘ืฆืข ื ืขื™ืœื•ืช ื“ื™ื ืžื™ื•ืช ื‘ื”ืชืื ืœืžืฆื‘ ื”ืชื•ื›ื ื™ืช. ื”ื™ื›ื•ืœืช ื”ื™ื—ื™ื“ื” ืฉืœื™ ื”ื™ื ืœืชืช ืฉืžื•ืช hardcoded ืœืžื ืขื•ืœื™ื ืžื” ืฉืขืœื•ืœ ืœื™ืฆื•ืจ ืžืฆื‘ื™ื ืฉืœ ื ืขื™ืœื” ืฉืœื ืœืฆื•ืจืš.

Pasted image 20240310164126.png|250

ืœืžืฉืœ ืื ื ืกืชื›ืœ ืขืœ ืงื˜ืข ื”ืงื•ื“ ื”ื–ื” ืฉื ื•ืขืœ dfs_visit ื”ื™ื ืฉืงื˜ืข ื”ืงื•ื“ ื”ืงืจื™ื˜ื™ ื ื•ืขืœ ืืช ื›ืœ ื”threads ื‘ื™ืŸ ืื ืกื•ืจืงื™ื ืดืขื•ืžืงื™ืืด ืฉื•ื ื™ื ืœื’ืžืจื™ ื‘ืขืฅ. ืื™ืŸ ื›ืืŸ ื”ื‘ื—ื ื” ื‘ื™ืŸ ื”ืื ื™ืฉ ืฆื•ืจืš ืœื”ืฆื”ืจืช ืงื˜ืข ืงื•ื“ ืงืจื™ื˜ื™ ื›ื™ ืื ื—ื ื• ื ื’ืฉื™ื ืœืื•ืชื• ืงื•ื“ืงื•ื“ ืื• ืœื. ื ืจืฆื” ืœื‘ื ื•ืช ืžื ื’ื ื•ืŸ ืฉื ื•ืชืŸ ืžื ืขื•ืœ ืœื›ืœ ืงื•ื“ืงื•ื“.

LOCKS

ื™ืฉื ื ื›ืžื” ืคื•ื ืงืฆื™ื•ืช ืฉื ื™ืชืŸ ืœื”ืฉืชืžืฉ ื‘ื”ื ื‘ืขื‘ื•ื“ื” ืขื ืžื ืขื•ืœื™ื ื‘ OpenMP

Pasted image 20240310164907.png

ื ื‘ื—ืŸ ืื™ืš ื ื™ืชืŸ ืœื‘ืฆืข ืฉื™ื ื•ื™ ืœืงื•ื“ ื” DFS ืขื ืžื ืขื•ืœื™ื ื•ืœืงื‘ืœ ืจืžืช ืžืงื‘ื™ืœื™ื•ืช ื’ื‘ื•ื”ื” ื™ื•ืชืจ ืขืœ ื™ื“ื™ ื ืขื™ืœื” ืจืง ื‘ืขืช ื”ืฆื•ืจืš


global variables: omp_lock_t locks[|V|]
DFS(G=(V,E)) {
   #pragma omp parallel for
   for ๐‘ฃโˆˆ V
        visited[v] โ† WHITE
        omp_init_lock(&locks[v])

   for ๐‘ฃโˆˆ V
        if visited[v] = WHITE
           #pragma omp parallel
               #pragma omp single
                    dfs_visit (v)
   for ๐‘ฃโˆˆ V
       omp_destroy_lock(&locks[v])
}

dfs_visit (v) {
   visited[v] โ† BLACK

   for ๐‘ขโˆˆ neighbors[v]
        omp_set_lock(&locks[u])
        if visited[u] = WHITE
            visited[u] โ† GRAY
            #pragma omp task
                  dfs_visit(u)
            omp_unset_lock(&locks[u])
}

ื ืฉื™ื ืœื‘ ืฉืื ื—ื ื• ื ื•ืขืœื™ื ืจืง ืืช ื”ืžื ืขื•ืœ ืฉืœ ืงื•ื“ืงื•ื“ ืกืคืฆื™ืคื™ ืฉื ื›ื ืก ืœืชื ืื™ ื” if visited(u) , ื‘ืื•ืคืŸ ื”ื–ื”, ืื ื—ื ื• ืžื•ื ืขื™ื ืžืฉื ื™ threads ืœื”ื›ื ืก ืœืงื•ื“ ื”ืจืงื•ืจืกื™ื” ื›ื™ ื‘ื”ื›ืจื— ื”ืจืืฉื•ืŸ ืฉื™ื™ื›ื ืก ื™ืฉื ื” ืืช ื” flag ืž WHITE ืœ GRAY. ืชื•ืš ื›ื“ื™ ืื ื—ื ื• ืžืืคืฉืจื™ื ืœืฉื ื™ threads ืฉืกื•ืจืงื™ื ืงื•ื“ืงื•ื“ื™ื ืฉื•ื ื™ื, ืœื”ืžืฉื™ืš ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช.

Atomic

ื“ืจืš ื ื•ืกืคืช ืœื”ืฉื™ื’ ืืช ืื•ืชื” ืชื•ืฆืื” ืจืฆื•ื™ื” ื”ื™ื ื‘ืืžืฆืขื•ืช Atomic instruction. ื ื–ื›ื™ืจ ืฉ ืคืขื•ืœื•ืช ืื˜ื•ืžื™ื•ืช ื”ืŸ ืคืชืจื•ืŸ ื—ื•ืžืจืชื™ ืœื‘ืขื™ืช ื” mutual exclusion ืฉืžื‘ื˜ื™ื—ื” ืฉืจืง thread ืื—ื“ ื™ื›ื•ืœ ืœื‘ืฆืข ืืช ื”ืคืขื•ืœื” ื”ืื˜ื•ืžื™ืช ื‘ื›ืœ ืจื’ืข ื ืชื•ืŸ ื•ื‘ื›ืš ืœื”ืžื ืข ืž race conditions.

ื”ื”ื‘ื“ืœ ื”ืžื”ื•ืชื™ ื‘ื™ืŸ ืคืขื•ืœื” ืื˜ื•ืžื™ืช ืœื‘ื™ืŸ ืžื ืขื•ืœื™ื ื”ื™ื ืฉื‘ืืžืฆืขื•ืช ืฉื™ืžื•ืฉ ื‘ืคืขื•ืœื•ืช ืื˜ื•ืžื™ื•ืช , ืื™ืŸ overhead ืฉืœ ื”ืงืคืืช ื”ืคืขื•ืœื” ืฉืœ threads ื•ืฉืžื™ืจืช ื”ืžืฆื‘ ืฉืœื”ื ืขื“ ืœืจื’ืข ืฉื”ื ื™ื›ื•ืœื™ื ืœืจื•ืฅ.
ื” mutual exclusion ื‘ atomic ื”ื•ื ื‘ืจืžืช ื”ื—ื•ืžืจื”, ืœืขื•ืžืช Locks ืื‘ืœ ื”ื•ื ื’ื ืžื—ื–ื™ืง ืืช ื” threads ื‘ busy wait (ืžืฆื‘ ืฉื‘ื• ื” thread ืžื‘ื–ื‘ื– ื–ืžืŸ ืžืขื‘ื“ ืขืœ ื”ืžืชื ื”).

Pasted image 20240310202839.png

ื”ืื˜ื•ืžื™ื•ืช ื”ื–ื• ืžืกื•ืคืงืช ืขืœ ื™ื“ื™ ื—ื•ืžืจื”, ืืฉืจ ืžืฉื‘ื™ืชื” ืืช ื”ื”ืคืจืขื•ืช ื‘ืžืขื‘ื“ ื”ืžืชืื™ื ื•ืžืฉื‘ื™ืชื” ืืช ื”ื’ื™ืฉื” ืœืžืขื‘ื“ -m (ืขืœ ื™ื“ื™ ื ืขื™ืœืช data bus), ืขื“ ืœื‘ื™ืฆื•ืข ื”ื•ืจืื” ืื˜ื•ืžื™ืช ื‘ืžืœื•ืื”.

Pasted image 20240310203017.png
ืžื” ืฉืื˜ื•ืžื™ ื‘ืชื•ืš ื”ื‘ืœื•ืง atomic ื”ื•ื ื”ืคืขื•ืœื” + ื‘ืงื•ื“ ื”ื ืดืœ. ื”ืคื•ื ืงืฆื™ื” do some work ื”ื™ื ืื™ื ื” ืื˜ื•ืžื™ืช ื•ืžืกืคืจ threads ืจืฆื™ื ื‘ืžืงื‘ื™ืœ.

Pasted image 20240310203222.png
ื ื™ืชืŸ ืœืจืื•ืช ืฉื›ืœ ื”- threads ืขื•ื‘ื“ื™ื ื‘ืžืงื‘ื™ืœ, ื•ืจืง ื”ื’ื™ืฉื” ืœ- x ื”ื™ื ื” ืžืกื•ื ื›ืจื ืช.

ื”ืกื™ื ืชืงืก ื”ื•ื ืžื”ืฆื•ืจื”

pragma omp atomic [read | write | update | capture]

Pasted image 20240310204407.png|250

ืœืžื” ืงืจื™ืื” ืื˜ื•ืžื™ืช ื–ื” ื—ืฉื•ื‘?

ืขืœ ืคื ื™ื• ื ืจืื” ืฉื”ืงืจื™ืื” ื”ืื˜ื•ืžื™ืช ื”ื™ื ืžื™ื•ืชืจืช ืฉื›ืŸ ืžื” ื”ื‘ืขื™ื” ืฉืžืกืคืจ threads ื™ืงืจืื• ืžืื•ืชื• ืขืจืš ื•ื™ื‘ืฆืขื• ื”ืฉืžื” ืฉืœื• ืœืžืฉืชื ื”? ื”ื‘ืขื™ื” ืžืชื—ื™ืœื” ื›ืืฉืจ ืชื—ื™ืœืช ื”ืขืจืš ืฉืœ ื”ืžืฉืชื ื” ื ืžืฆื ื‘ื›ืชื•ื‘ืช ืฉื”ื™ื ืœื align. ื‘ืžืฆื‘ ื–ื” ืขืœื•ืœ ืœืงืจื•ืช ืžืฆื‘ ืฉื‘ื• ืื ื—ื ื• ืžื‘ืฆืขื™ื ืžืกืคืจ ื’ื™ืฉื•ืช ืœRAM ืจืง ื›ื“ื™ ืœื”ื‘ื™ื ืขืจืš ืฉืœ ืžืฉืชื ื” ื•ื‘ืžืฆื‘ ื›ื–ื” ืื ื ืงืจื ืืช ื”ืขืจืš ืขืœื•ืœ ืœื”ื™ื•ืช ืžืฆื‘ ืฉืชื•ืš ื›ื“ื™ ืงืจื™ืื” ืฉืœ ื‘ื™ืช ืื—ื“ ืžื”ืžืฉืชื ื” ื”ื‘ื™ืช ื”ืฉื ื™ ื™ืฉืชื ื” ืชื•ืš ื›ื“ื™ ืื ืœื ื ื’ื“ื™ืจ ืืช ื”ืคืขื•ืœื” ื›ืื˜ื•ืžื™ืช (ืื• ื ืขื‘ื•ื“ ืขืaligned addresses).
ืœื“ื•ื’ืžื” - ืฉืงืจื™ืื•ืช ืžื–ื›ืจื•ืŸ ืžืชื‘ืฆืขื•ืช ื‘- chunks ืฉืœ 64 bits, ื”ื—ืœ ืžื›ืชื•ื‘ืช ืฉืžืชื—ืœืงืช ื‘- 8 . ืื ืžืฉืชื ื” short x ื ืžืฆื ืขืœ ื”ืชืคืจ, ื ืฆื˜ืจืš ืฉืชื™ ื’ื™ืฉื•ืช ืœื–ื›ืจื•ืŸ ื›ื“ื™ ืœื”ื‘ื™ื ืืช ืฉื ื™ ื”- bytes ืฉืœ x. ื‘ื™ืŸ ืงืจื™ืื” ืจืืฉื•ื ื” ืœืงืจื™ืื” ืฉื ื™ื™ื” ืขืœื•ืœ ืœื”ื›ื ืก thread ืื—ืจ ืฉืขืœื•ืœ ืœื”ืจื•ืก ืืช ื”ืงืจื™ืื” ืฉืœ x.
Pasted image 20240310231106.png|300
Pasted image 20240310231205.png|300
ื“ื•ื’ืžื ืœืชื–ืžื•ืŸ ืœื ืžื•ืฆืœื— ืฉืœ ืฉื ื™ threads, ื›ืืฉืจ thread ืื—ื“ ืžื ืกื” ืœืงืจืื• ืืช ื”ืขืจืš ืฉืœ x, ื•- thread ืฉื ื™ ืžื ืกื” ืœื›ืชื•ื‘ ืขืจืš ื—ื“ืฉ ืœ- x. ื‘ืกื•ืฃ ื”ืงืจื™ืื” v ืžืงื‘ืœ junk value.

ื ืจืื” ืฉื™ืžื•ืฉ ื‘ืคืขื•ืœื•ืช ืื˜ื•ืžื™ื•ืช ืœืžื™ืžื•ืฉ dfs
Pasted image 20240310232518.png
ื›ืืŸ ื”ืฉืชืžืฉื ื• ื‘ compare and capture ืคืขื ืื—ืช ื›ืœ ืคืขื ืฉื ื›ื ืกื™ื ืœ dfs visit. ืžืฆื‘ ื–ื” ื’ื•ืจื ืœื›ืš ืฉื›ืœ ืงื•ื“ืงื•ื“ ื‘ืฆื•ืจื” ืื˜ื•ืžื™ืช ืฉื•ืžืจ ืืช ื”old value ืฉืœ ืงื•ื“ืงื•ื“ ืฉื”ื•ื ืžื‘ืงืจ ื‘ื•ื ื•ื˜ื•ืขืŸ ื‘ืชื•ื›ื• ืขืจืš ื—ื“ืฉ. ืื ื”ืขืจืš ื”ื™ืฉืŸ ื”ื™ื” white ืื– ืžื‘ืงืจื™ื ื‘ื•. ื‘ื’ืœืœ ืฉืคืขื•ืœื” ื”capture ื”ื™ื ืื˜ื•ืžื™ืช, ืื– ืื™ืŸ ื—ืฉืฉ ืฉื”ืชื ืื™ ื™ืชืงื™ื™ื ืขื‘ื•ืจ ืฉื ื™ threads ืฉืžื‘ืงืจื™ื ื‘ืื•ืชื• ืงื•ื“ืงื•ื“.

Sorted insertion to a linkes list

ื ืจืฆื” ืžื‘ื ื” ื ืชื•ื ื™ื ืฉืœ ืจืฉื™ืžื” ืžืงื•ืฉืจืช ืฉืชื•ืžื›ืช ื‘ืžื™ืงื‘ื•ืœ.
ื ืจืฆื” ื‘ืžืงื•ื ืคื•ื ืงืฆื™ื™ืช ื”insert ื”ืจื’ื™ืœื” ืœืชืžื•ืš ื‘sorted_insert ืฉืžื›ื ื™ืก ืื™ื‘ืจื™ื ื‘ืฆื•ืจื” ืžืžื•ื™ื™ื ืช ืœืจืฉื™ืžื”.

ื‘ืงื•ื“ ืกื“ืจืชื™ ื–ื” ื™ืจืื” ื›ืš

void sorted_insert(node_t *head, int val) {
    /* Find the right place to insert */
    node_t *ptr = head->next;
    node_t *prev = head;

    while (ptr != NULL && ptr->value < val) {
        prev = ptr;
        ptr = ptr->next;
    }

    /* Insert the new node */
    node_t *new_node = malloc(sizeof(node_t));
    new_node->value = val;
    new_node->next = ptr;
    prev->next = new_node;
}

ืื ื—ื ื• ื ืงื“ื ืืช ื”ืžืฆื‘ื™ืข ื›ืœ ืขื•ื“ ื”ืขืจืš ื”ื ื•ื›ื—ื™ ืฉืœ ื”ืžืฆื‘ื™ืข ืœืื™ื‘ืจ ื‘ืจืฉื™ืžื” ืžืงื•ืฉืจืช ืงื˜ืŸ ืžื”ืขืจืš ืฉื ืจืฆื” ืœื”ื›ื ื™ืก. ื‘ืจื’ืข ืฉื”ื’ืขื ื• ื ื›ื ื™ืก ืืช ื” node ื”ื—ื“ืฉ.

ื ื ืกื” ืœื›ืชื•ื‘ ืืช ื”ืงื•ื“ ื”ื–ื” ื›ื“ื™ ืœืชืžื•ืš ื‘ืžืงื‘ื™ืœื™ื•ืช ื›ืžื” ืฉื™ื•ืชืจ ืžื”ื™ืจื” ื‘ืืžืฆืขื•ืช ื›ืœื™ ื”ืกื ื›ืจื•ืŸ ื”ืฉื•ื ื™ื.

V1

ื”ืžื™ืžื•ืฉ ื”ื›ื™ ื ืื™ื‘ื™ ื™ื”ื™ื” ืœื”ืฉืชืžืฉ ื‘ืžื ืขื•ืœ ืื—ื“ ื‘ืชื•ืš ื”ืคื•ื ืงืฆื™ื” ืฉืœ sorted_insert. ื”ืจื‘ื” ืคืขืžื™ื ื‘ืคืชืจื•ืŸ ืฉืœ ืืœื’ื•ืจื™ืชืžื™ื ืžืงื‘ื™ืœื™ื™ื ื ืจืฆื” ืœื”ืชื—ื™ืœ ืžืœื”ื‘ื™ืŸ ื‘ื›ืœืœ ืžื”ื• ื”ืคืชืจื•ืŸ ื”ื ืื™ื‘ื™ ื‘ื™ื•ืชืจ ื•ืœืคืชื— ืื•ืชื•.

void sorted_insert(node_t *head, int val) {
    /* Find the right place to insert */
    omp_set_lock(&lock);
    node_t *ptr = head->next;
    node_t *prev = head;

    while (ptr != NULL && ptr->value < val) {
        prev = ptr;
        ptr = ptr->next;
    }

    /* Insert the new node */
    node_t *new_node = malloc(sizeof(node_t));
    new_node->value = val;
    new_node->next = ptr;
    prev->next = new_node;
    omp_unset_lock(&lock);
}

ื”ืžื™ืžื•ืฉ ื”ื–ื” ื”ื•ืคืš ืืช ื”ื‘ืœื•ืง ื”ื–ื” ืœืกื“ืจืชื™ ื•ืœื›ืŸ ืจืžืช ื”ืžืงื‘ื™ืœื™ื•ืช ื–ื”ื” ืœืกื“ืจืชื™ ื•ื‘ืชื•ืกืคืช ื” overhead ืฉืœ ื™ืฆื™ืจืช threads , ื”ืงื•ื“ ืฉืœื ื• ื™ื”ื™ื” ื’ืจื•ืข ื‘ื”ืจื‘ื” ืžื”ืงื•ื“ ื”ืกื“ืจืชื™.

V2

ื”ื’ืจืกื” ื”ืฉื ื™ื™ื” ืฉืœื ื• ืชืฆื™ืข ืจืขื™ื•ืŸ ืขื ืžืงื‘ื™ืœื™ื•ืช ื™ื•ืชืจ ื’ื‘ื•ื”ื”.
ืื ื—ื ื• ื ืจืฆื” ืฉื›ืœ thread ืฉืงื•ืจื ืœืคื•ื ืงืฆื™ื” sorted_insert ืžืืคืฉืจ ืœ threads ืฉื™ื‘ื•ื ืื—ืจื™ื• ืœืขื‘ื•ื“ ืจืง ืขืœ ืจื™ืฉื ืฉืœ ื”ืจืฉื™ืžื”. ืžืขื™ืŸ ืžื ื’ื ื•ืŸ pipeline ื›ืš ืฉื‘ืžืงื•ื ืฉื›ืœ thread ื™ืฆื˜ืจืš ืœืžืฆื•ื ื”ื™ื›ืŸ ืœืžืงื ืืช ื”ืขืจืš ืฉืœื• ื‘ืจืฉื™ืžื” ื›ื“ื™ ืœืืคืฉืจ ืœ thread ื”ื‘ื ืœืจื•ืฅ, ื”ื•ื ื—ื•ืกื ืœthreads ืื—ืจื™ื ืจืง ืชืช ืจืฉื™ืžื” ื™ืžื ื™ืช ื•ืžืืคืฉืจ ืœื”ื ืœืขื‘ื•ื“ ืขืœ ื”ืฉืžืืœื™ืช ืขื“ ืฉื”ื•ื ืžืกื™ื™ื.

void sorted_insert(node_t *head, int val) {
    /* Signal start of search */
    omp_set_lock(head->lock);

    /* Find the right place to insert */
    node_t *ptr = head->next;
    node_t *prev = head;

    while (ptr != NULL && ptr->value < val) {
        omp_set_lock(ptr->lock);
        omp_unset_lock(prev->lock);
        prev = ptr;
        ptr = ptr->next;
    }

    /* Insert the new node */
    node_t *new_node = malloc(sizeof(node_t));
    new_node->value = val;
    new_node->next = ptr;
    new_node->lock = malloc(sizeof(omp_lock_t));
    omp_init_lock(new_node->lock);
    prev->next = new_node;
    omp_unset_lock(prev->lock);
}

ื›ืขืช ืœื›ืœ ืื™ื‘ืจ ื‘ืจืฉื™ืžื” ื™ืฉ ืžื ืขื•ืœ ืžืฉืœื•. ื›ืœ thread ืฉื ื›ื ืก ื ื•ืขืœ ืืช ื”ptr ืฉืขืœื™ื• ื”ื•ื ื›ืจื’ืข ื ืžืฆื ื‘ืจืฉื™ืžื” ื•ืžืฉื—ืจืจ ืืช ื”ืงื•ื“ื ื•ื›ื›ื” ื”ื•ื ืžืžืฉื™ืš.

V3

ื”ื’ืจืกื” ื”ืฉืœื™ืฉื™ืช ืฉืœ ืชืžื™ื›ื” ื‘ืžืงื‘ื™ืœื™ื•ืช ืฉืœ sorted_insert ืชืขื‘ื•ื“ ื‘ืื•ืคืŸ ื”ื‘ื-

ื ืืคืฉืจ ืœthreads ืœืžืฆื•ื ืืช ืžื™ืงื•ื ื”ptr ืฉืื—ืจื™ื• ืืžื•ืจ ืœื”ื›ื ืก ื”ืงื•ื“ืงื•ื“ ื”ื—ื“ืฉ ื‘ืฆื•ืจื” ืžืงื‘ื™ืœื™ืช ื‘ืœื™ ืžื ื’ื ื•ืŸ ืกื ื›ืจื•ืŸ.
ื‘ืจื’ืข ืฉthread ื›ืœืฉื”ื• ืžืฆื ืืช ื”ptr ื”ื–ื”, ื”ื•ื ืžื ืกื” ืœื ืขื•ืœ ืืช ื”ืžืฆื‘ื™ืข prev , ื–ื” ืฉืžืฆื‘ื™ืข ืœptr.

ื”ืกื™ื‘ื” ืฉื”ื•ื ืžื ืกื” ืœื ืขื•ืœ ืื•ืชื• ื”ื™ื ื‘ื’ืœืœ ืฉื™ื›ื•ืœ ืœื”ื™ื•ืช ืฉืชื•ืš ื›ื“ื™ ื”ื—ื™ืคื•ืฉ ืฉืœ ื”ptr ืงืจื” ืžืฆื‘ ืฉ threads ืื—ืจื™ื ื”ื•ืกื™ืคื• ืขืจื›ื™ื ื‘ื™ืŸ prev ืœ ptr ื•ื›ืขืช ื”ืžืฆื‘ ื›ื‘ืจ ืœื ืขื“ื›ื ื™.

ืืช ื–ื” ืืคืฉืจ ืœืขืฉื•ืช ืขืœ ื™ื“ื™ ื”ืฉื•ื•ืื” ื‘ื™ืŸ ื”ptr ืœื‘ื™ืŸ prev.next . ืื thread ืื—ืจ ื”ื•ืกื™ืฃ ืื™ื‘ืจ ื‘ื™ืŸ prev ืœ ptr ื”ื”ืฉื•ื•ืื” ื”ื–ืืช ืชื’ืœื” ืœื ื• ืืช ื–ื”. ื‘ืžืฆื‘ ื–ื” ืื ื—ื ื• ืžื‘ืฆืขื™ื ื‘ื™ื“ื™ื•ืง ืืช ืื•ืชื• ื”ืงื•ื“ ื‘ื’ืจืกื” v2 ืจืง ืฉื”ืคืขื ื–ื” ื›ื ืจืื” ื™ื”ื™ื” ืขืœ ืชืช ืจืฉื™ืžื” ืžืฆื•ืžืฆืžืช ื‘ื”ืจื‘ื”.

ืœื‘ืกื•ืฃ ืžื‘ืฆืขื™ื ื”ื›ื ืกื” ืฉืœ ื”ืขืจืš ื”ื—ื“ืฉ ืœืจืฉื™ืžื”.

void sorted_insert(node_t *head, int val) {
    /* Find the right place to insert */
    node_t *ptr = head->next;
    node_t *prev = head;

    while (ptr != NULL && ptr->value < val) {
        prev = ptr;
        ptr = ptr->next;
    }

    /* Lock the elment that may be modified */
    omp_set_lock(prev->lock);

    /* Confirm that the element is still in the right place */
    if (prev->next != ptr) {
        ptr = prev->next;
        while (ptr != NULL && ptr->value < val) {
            omp_set_lock(ptr->lock);
            omp_unset_lock(prev->lock);
            prev = ptr;
            ptr = ptr->next;
        }
    }

    /* Insert the new node */
    node_t *new_node = malloc(sizeof(node_t));
    new_node->value = val;
    new_node->next = ptr;
    new_node->lock = malloc(sizeof(omp_lock_t));
    omp_init_lock(new_node->lock);
    prev->next = new_node;
    omp_unset_lock(prev->lock);
}